#1
|
|||
|
|||
Removing duplicates
Hi Folks,
I have an excel sheet. In column A, I have 200 names(Firstname Lastname format). I want to remove duplicates from this column. Now the challenge is somes firstname and last names gets interchanged(e.g. value in A3 is John Smith and value is A80 is Smith John). Ideally its the same person and hence i have to keep his name only once. So can anyone please assist me in removing such interchanged names duplicates. Thanks. |
#2
|
|||
|
|||
Hmmmn I will need to see all the names to see if the code I am thinking will work. Can you post a sample book and I can work on this for you.
Thanks |
#3
|
|||
|
|||
HI,
As excelledsoftware mention. If you can attach a workbook so we can see what the data is we may be able to help. |
#4
|
|||
|
|||
Hi,
I have attached an excel sheet. In column A you will find few names one below another. Few names are repeate and few names are repeated with interchnaged first and last names. e.g. A4= Ethan Anderson and A10= Anderson Ethan. I am facing challange here in removing such duplicate names. Kindly assist. Thanks, |
#5
|
|||
|
|||
Hi,
Thanks for the file. But, do you have something in the other columns? Can you provide more info for the other columns? |
#6
|
|||
|
|||
Charles, Ive got the idea on how to do this but you look pretty interested. Do you want to take a stab at it before I submit the code?
|
#7
|
|||
|
|||
Public Sub Test()
Dim h, i, j, k As Long, Split_1 Dim FirstName As String, SR As Range Set SR = [Sheet1!A2:A11] k = 1 For Each i In Range(SR(1), SR(SR.Count - 1)) k = k + 1 FirstName = Mid(i, 1, InStr(1, i, " ") - 1) For Each j In Range(i(k), SR(SR.Count)) Split_1 = Split(j, " ") For Each h In Split_1 If FirstName = h Then MsgBox FirstName & " (" & i.Address & _ " " & j.Address & ")" End If Next Next Next Bruno End Sub |
#8
|
|||
|
|||
Quote:
|
#9
|
|||
|
|||
Actually this code has a bug. It only checks for the first name and not the last name. If you change a first name to Jacob the code will show that there is a duplicate when in fact there is not. Also I am not sure if msgboxes are the most efficient way to let the user know what duplicates to remove. I am curious about the solution to this myself so I am working on a sub that should do it. However since Bruno spent some time on providing a "in the works" solution I would like to see if he wants to fix the issues before I submit my code.
Thanks |
#10
|
|||
|
|||
Ok it has been over an hour with no reply so I am submitting the code that I wrote for this. This code will place a result in column B next to the value in Column A.
I have written 2 codes. One that will identify every duplicate to where you could sort and filter to identify which ones to remove and then another code that will identify just the duplicates. Meaning that if cell A3 has Jacob Parkar and cell A5 and A6 has Jacob Parkar or Parkar Jacob then only cells A5 and A6 will be identifed as duplicates. The only issue with this is that you may want the name with the last name starting as the original so play around with each and see which one works best. As with all code be sure to back up your workbook and save before running. Code:
Sub CheckAllDuplicates() 'Checks each name to see if there are duplicates Dim ws As Worksheet, CheckRow As Long, FirstName As String, LastName As String Dim LastRow As Long, NameString As String, CheckName As String, CheckString As String Dim Counter As Integer 'Set the references Set ws = ThisWorkbook.Worksheets(1) LastRow = ws.Range("A2").End(xlDown).Row For CheckRow = 2 To LastRow CheckString = ws.Range("A" & CheckRow).Value CheckName = "<" & CheckString & ">" FirstName = Mid(CheckString, 1, InStr(1, CheckString, " ") - 1) LastName = Mid(CheckString, Len(FirstName) + 2) 'NameString = NameString & "<" & FirstName & " " & LastName & ">" NameString = NameString & "<" & LastName & " " & FirstName & ">" Next CheckRow For CheckRow = 2 To LastRow CheckString = ws.Range("A" & CheckRow).Value CheckName = "<" & CheckString & ">" Counter = 0 Counter = WorksheetFunction.CountIf(ws.Range("A2" & ":A" & LastRow), CheckString) If InStr(1, NameString, CheckName) Then Counter = Counter + 1 End If If Counter <> 1 Then Range("B" & CheckRow).Value = "Duplicate" End If Next CheckRow End Sub Code:
Sub CountNameDuplicates() 'Checks each name to see if there are duplicates Dim ws As Worksheet, CheckRow As Long, FirstName As String, LastName As String Dim LastRow As Long, NameArray() As String, CheckName As String, CheckString As String Dim Counter As Integer, arr As Long 'Set the references Set ws = ThisWorkbook.Worksheets(1) LastRow = ws.Range("A2").End(xlDown).Row ReDim NameArray(2 To LastRow) As String For CheckRow = 2 To LastRow CheckString = ws.Range("A" & CheckRow).Value CheckName = "<" & CheckString & ">" FirstName = Mid(CheckString, 1, InStr(1, CheckString, " ") - 1) LastName = Mid(CheckString, Len(FirstName) + 2) 'NameString = NameString & "<" & FirstName & " " & LastName & ">" NameArray(CheckRow) = "<" & LastName & " " & FirstName & ">" Next CheckRow For CheckRow = 2 To LastRow CheckString = ws.Range("A" & CheckRow).Value CheckName = "<" & CheckString & ">" Counter = 0 Counter = WorksheetFunction.CountIf(ws.Range("A2" & ":A" & CheckRow), CheckString) 'Check the array For arr = 2 To CheckRow If NameArray(arr) = CheckName Then Counter = Counter + 1 Exit For 'no need to grab more End If Next arr 'Check If Counter <> 1 Then Range("B" & CheckRow).Value = "Duplicate" End If Next CheckRow End Sub To Bruno, I appreciate the code that you wrote and effort you put forth. Charles and I were just unaware that you were going to provide a solution. I have had a time where I posted something without looking at the replies and caused unnecessary work for the person that was already working on it. Please let me know if you have any concerns regarding my comments. Thanks so much. |
#11
|
|||
|
|||
No concerns at all!
I just noted the bug you mentioned and the non-ability of my code to discover names like: Juan Carlos Primero -- Juan Primero Carlos aa bb cc dd ee ff gg -- ff gg cc aa bb ee dd and so on. Then this new code: ========================= Public Sub Test() Dim k As Long, Split_1, Split_2 Dim h, i, j, n, SR As Range Set SR = [Sheet1!A2:A26] k = 1 For Each i In Range(SR(1), SR(SR.Count - 1)) k = k + 1 Split_1 = Split(i, " ") For Each j In Range(i(k), SR(SR.Count)) Split_2 = Split(j, " ") If UBound(Split_1) = UBound(Split_2) Then k = 0 For Each h In Split_1 For Each n In Split_2 If h = n Then k = k + 1 End If Next If k = UBound(Split_1) + 1 Then MsgBox i & " " & j & vbCrLf & _ i.Address & " " & j.Address End If Next End If Next k = 1 Next End Sub ====================== Your opinion would be much appreciated Bruno |
#12
|
|||
|
|||
If you want to immediately delete duplicate (record or entire row):
============================= Public Sub Test_1() Dim k As Long, Split_1, Split_2 Dim h, i, j, n, SR As Range Set SR = [Sheet1!A2:A29] k = 1 For Each i In Range(SR(1), SR(SR.Count - 1)) k = k + 1 Split_1 = Split(i, " ") For Each j In Range(i(k), SR(SR.Count)) Split_2 = Split(j, " ") If UBound(Split_1) = UBound(Split_2) Then k = 0 For Each h In Split_1 For Each n In Split_2 If h = n Then k = k + 1 End If Next If k = UBound(Split_1) + 1 Then 'MsgBox i & " " & j & vbCrLf & _ i.Address & " " & j.Address j(1, 1) = "*" & j.Address End If Next End If Next k = 1 Next For k = SR.Count To 1 Step -1 If Left(SR(k), 1) = "*" Then ' Delete Record SR(k).Delete Shift:=xlUp ' Delete Entire Row 'SR(k).EntireRow.Delete Shift:=xlUp End If Next End Sub =============================== Bruno |
#13
|
|||
|
|||
Quote:
I absolutely love how you used the Split function to extract out the first and last name. Brilliant I of course used 2 mids but it would have been much easier to use your method. The first code fixed the issue it displayed a msgbox identifying the duplicates. The next code went a step further by deleting the entry right out and not actually deleting the entire row (Good call on that) the only issue I ever have with code deleting data is if there was some reason the user needed to validate what the code did. I am confident that your code works but users may want to see the data to delete before they actually do so. I learned a good amount from your code and the techniques you did. The last bit of feed back is to place any code in the code brackets. use [ then "code" then another ] without the quotes then to end your code tag use [ then /code then antoher ] This is very very important as it will maintain your indentation and make things more consistent. Keep up the great work. Thanks |
#14
|
|||
|
|||
Hello Folks,
Both codes are working perfectly fine. Thank you so much for such a quick turn around. Once again thanks a ton to both excelledsoftware and Bruno Campanini for your help. Have a great day ahead guys. |
#15
|
|||
|
|||
excelledsoftware, Bruno
Nice bit of code. I was wanting to see if there was data in another column that would indicate if the names were duplicated. I'm still not too old to learn from others, |
Tags |
duplicate entries |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Extract duplicates in table | goran.c | Excel | 0 | 01-21-2015 12:47 AM |
Adding duplicates as comments | mnut | Excel Programming | 1 | 10-17-2014 10:05 PM |
Mail duplicates | mixy | Outlook | 0 | 02-10-2011 12:54 AM |
Inbox and subfolder duplicates | JoJo | Outlook | 1 | 01-28-2011 04:39 PM |
sum of duplicates | zxmax | Excel | 1 | 09-29-2006 08:29 PM |