Microsoft Office Forums

Go Back   Microsoft Office Forums > >

Reply
 
Thread Tools Display Modes
  #16  
Old 01-30-2025, 03:18 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default



Quote:
Originally Posted by gmaxey View Post

Your process seems faster but it isn't returning all the required results.

Well, because you didn't write many things. I can't guess your intention. You have to say everything.
1. The case of duplicates in column H in NDS_SHEET In NSD_SHEET H24 = H25 = DemoData_0000015_Import.zip You didn't write how to search, so I did this: When the code finds H24 = "DemoData_0000015_Import.zip" in B229 in REF_SHEET, it no longer searches for H25 in REF_SHEET (that's why I use dictionary DIC) So how do you want it now? Should there be 2 results (for H24 and H25)? <#266912.1><line 24> <#266912.1><line 25> ??? Or maybe something else? If you don't say anything, I won't know anything.



2. The issue of duplicates in column B in REF_SHEET. For example, let's say that "DemoData_0000015_Import.zip" occurs 3 times in column B of REF_SHEET - in B299, B367 and B489. How many results should there be? At the moment, the code will find "DemoData_0000015_Import.zip" in B299 and stop searching, it will not search any further to find in B367 and B489 (Exit For in If pos Then … End If). So what do you want now? The code should search further to B367 and B489?



3. Are we looking for "File Name" from column H of sheet NDS_SHEET in column B of sheet REF_SHEET, or vice versa: will we find for each "File Path" from column B of sheet REF_SHEET its "companion" in column H of sheet NDS_SHEET?


4. Or maybe something else that I don't know about? If you don't say anything, I won't know anything.
Reply With Quote
  #17  
Old 01-30-2025, 04:06 AM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Quote:
Originally Posted by batman1 View Post
Well, because you didn't write many things. I can't guess your intention. You have to say everything.
1. The case of duplicates in column H in NDS_SHEET In NSD_SHEET H24 = H25 = DemoData_0000015_Import.zip You didn't write how to search, so I did this: When the code finds H24 = "DemoData_0000015_Import.zip" in B229 in REF_SHEET, it no longer searches for H25 in REF_SHEET (that's why I use dictionary DIC) So how do you want it now? Should there be 2 results (for H24 and H25)? <#266912.1><line 24> <#266912.1><line 25> ??? Or maybe something else? If you don't say anything, I won't know anything.



2. The issue of duplicates in column B in REF_SHEET. For example, let's say that "DemoData_0000015_Import.zip" occurs 3 times in column B of REF_SHEET - in B299, B367 and B489. How many results should there be? At the moment, the code will find "DemoData_0000015_Import.zip" in B299 and stop searching, it will not search any further to find in B367 and B489 (Exit For in If pos Then … End If). So what do you want now? The code should search further to B367 and B489?



3. Are we looking for "File Name" from column H of sheet NDS_SHEET in column B of sheet REF_SHEET, or vice versa: will we find for each "File Path" from column B of sheet REF_SHEET its "companion" in column H of sheet NDS_SHEET?


4. Or maybe something else that I don't know about? If you don't say anything, I won't know anything.

Batman,
I understand your frustration. As I tried to explain at the very start, it is a complicated process. Complicated and difficult to explain.


The REF_SHEET can contain hundreds of thousands of records. The one I am testing with has 440,000. The NDS_SHEET can have tens of thousands, but for testing I have only 21.


The code I provided in modMain launches a userform interface. With that interface you can create all of the possible outcomes. It works. I am just looking for something that might work faster.


DemoData_0000015_Import.zip You didn't write how to search, so I did this: When the code finds H24 = "DemoData_0000015_Import.zip" in B229 in REF_SHEET, it no longer searches for H25 in REF_SHEET (that's why I use dictionary DIC) So how do you want it now? Should there be 2 results (for H24 and H25)? <#266912.1><line 24> <#266912.1><line 25> ??? Or maybe something else? If you don't say anything, I won't know anything.



That depends on if you select "First Match Only" or "Duplicate Reference Row" in the first case then there should be only one record in the Overlay for #266912.1" in the second case there should be two records.


2. The issue of duplicates in column B in REF_SHEET. For example, let's say that "DemoData_0000015_Import.zip" occurs 3 times in column B of REF_SHEET - in B299, B367 and B489. How many results should there be? At the moment, the code will find "DemoData_0000015_Import.zip" in B299 and stop searching, it will not search any further to find in B367 and B489 (Exit For in If pos Then … End If). So what do you want now? The code should search further to B367 and B489?


Yes. Every record of the REF_SHEET that might have a match in the NDS_SHEET must be processed. That was my initial approach. Loop through every REF_SHEET record ... with my actual data this was taking about a minute.

I changed that initial approach to first loop through the smaller NDS_SHEET to find and create a collection of REF_SHEET Rows that "DO" have a match. I then looped through the Rows in that collection. With the actual data the process now takes about 12 seconds. The options for the REF_SHEET records are:
If No Match Found in NDS_SHEET
1. Discard (don't include) RS_SHEET REF/Control# records in Overlay
2. Copy control ID from RS_SHEET as new record in Overlay
3. Cancel processing - don't create overlay



3. Are we looking for "File Name" from column H of sheet NDS_SHEET in column B of sheet REF_SHEET. Yes. NDS_SHEET column 8 list single file name e.g., Test.png.
REF_SHEET column 2 can list single or mulitple files names e.g., Testing.png//Test.png//This is a test.png. If InStr(REF_SHEET(2),NDS_SHEET(8)> 0 Then
If First Match Option
Record and get out
Else
Record either as new row, or delimited data in existing row.
End If

... will we find for each "File Path" from column B of sheet REF_SHEET its "companion" in column H of sheet NDS_SHEET? It is not really a file path. It can be a single file name or multiple files name delimited with // / or \. No. There are three options if a "companion" is not found in the NDS_SHEET.



All the looping takes time. As I've said, I've gotten down to 12 seconds with the 440,000 REF_SHEET I have. I like your thought process and its possible it might shave off a a few seconds. I thought a power query might get it done in the blink of and eye?? Thank you.
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #18  
Old 01-30-2025, 04:38 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

Quote:
Originally Posted by gmaxey View Post

Yes. Every record of the REF_SHEET that might have a match in the NDS_SHEET must be processed.
Well, in point 2 you have to throw away EXIT FOR to search further.
Quote:



3. Are we looking for "File Name" from column H of sheet NDS_SHEET in column B of sheet REF_SHEET. Yes.
Just Yes/No, no explanation needed. Now we need to make changes.
1. Change to
Code:
 ReDim result(1 To UBound(find_data, 1), 1 To 9)
2. Delete "Exit For"


That's all
Attached Files
File Type: xlsm Overlay Tool Sample Data BM.xlsm (89.3 KB, 10 views)
Reply With Quote
  #19  
Old 01-30-2025, 06:02 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

At the moment we are taking all duplicates in REF_SHEET.

Still the matter of duplicates in column H in NDS_SHEET. At the moment H24 = H25 = DemoData_0000015_Import.zip. The code only takes H24 ("First Match Only"?) so we add to DIC key = filename = "DemoData_0000015_Import.zip", item = i = 24 (row number where "DemoData_0000015_Import.zip" occurs).

If you want 2 results:
<#266912.1><line 24>
<#266912.1><line 25>
you need to add 2 items to DIC:
key = "DemoData_0000015_Import.zip", item = 24
key = "DemoData_0000015_Import.zip", item = 25

In total we take all duplicates in REF_SHEET and all duplicates in H in NDS_SHEET:
#266912.1 <line 24 in NDS_SHEET>
#266912.1 <line 25 in NDS_SHEET>
#77777.1 <line 24 in NDS_SHEET>
#77777.1 <line 25 in NDS_SHEET>
#77777.2 <line 24 in NDS_SHEET>
#77777.2 <line 25 in NDS_SHEET>

So 6 results (3*2) for "DemoData_0000015_Import.zip": 3 duplicates in REF_SHEET and 2 duplicates in NDS_SHEET

Code for this case:
Code:
Option Explicit

Sub demo()
Dim lastRow As Long, pos As Long, i As Long, j As Long, k As Long, count As Long, curr_row As Long, filename As String, key, item, data(), find_data(), result(), dic As Object
    With Worksheets("OVERLAY")
'        cleaning old results
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow > 1 Then .Range("A2").Resize(lastRow - 1, 9).ClearContents
    End With
    
    With Worksheets("NDS_SHEET")
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow = 4 Then Exit Sub    ' there is no data in NDS_SHEET
        data = .Range("A5").Resize(lastRow - 4, 8).value    ' A:H to data
    End With
    
    With Worksheets("REF_SHEET")
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow = 1 Then Exit Sub    ' there is no data in REF_SHEET
        find_data = .Range("A2").Resize(lastRow - 1, 2).value
    End With
    
    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = vbTextCompare
    For i = 1 To UBound(data, 1)
        filename = data(i, 8)  ' column 8 sheet NDS_SHEET
        If Not dic.Exists(filename) Then
            ReDim item(1 To 1)
'            first duplicate row number
            item(1) = i
            dic.Add filename, item
        Else
            item = dic.item(filename)
            ReDim Preserve item(1 To UBound(item) + 1)
'            next duplicate row number
            item(UBound(item)) = i
            dic.item(filename) = item
        End If
    Next i
'    in case of and duplicates in NDS_SHEET and duplicates in REF_SHEET the result array is the largest
    ReDim result(1 To UBound(find_data, 1) * UBound(data, 1), 1 To 9)
    
    For Each key In dic.keys
        item = dic.item(key)    ' array of duplicate row numbers in column H sheet NDS_SHEET
        For i = 1 To UBound(find_data, 1)
            pos = InStr(1, find_data(i, 2), key, vbTextCompare)
            If pos Then
                For k = 1 To UBound(item)
                    count = count + 1
                    result(count, 1) = find_data(i, 1)
                    curr_row = item(k)    ' line number in data() with given filename
                    For j = 1 To 8
                        result(count, j + 1) = data(curr_row, j)
                    Next j
                Next k
            End If
        Next i
    Next key
    
    If count Then Worksheets("OVERLAY").Range("A2").Resize(count, 9).value = result
    
    Set dic = Nothing
End Sub
Reply With Quote
  #20  
Old 01-30-2025, 09:28 AM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Batman,


Batman,

I have been off tinkering with the code you sent before this last. Yes, it was returning all the FirstMatch rows (or 9 rows with the our test file) (or 39 rows with my larger test file)

As you have already discovered, to return FirstMatch and duplicate rows then you must cycle every NDS file name for every REF file name(s).

The MyDemo() below was what I have come up with and was testing before your last post. The Sub YourLastModified() below is your last with a modification to either FirstMatch or Duplicate. Both return 9 rows with FirstMatch option on and 12 rows with FirstMatch option off.

You are light years ahead of me with using the Scripting.Dictionary (My hat is off to you). The time difference with our small sample test file is insignificant.

With the larger file 440,000 REF_SHEET rows. Your method is about 0.4 seconds faster. However either method is about 4 seconds faster than what I currently have coded in the userform.

The only advantage with Sub MyDemo() is that the arrangement of the REF/Control# in the Overlay match those in the REF. That can easily be fixed. So no, worries.

Now, there is much more “mud” in the mix, so my next step is to see if I can accommodate the other Dup row options (I think I can).

Another chap is off looking at using PowerQuery, to achieve the desired result. Will just have to wait as see what he comes up with.

In any case this has been a tremendous learning experience for me and I thank you very much for for your time and interest.



Code:
Option Explicit
Sub MyDemo()
Dim lastRow As Long, pos As Long, i As Long, j As Long, count As Long, curr_row As Long, filename As String, key, item, data(), find_data(), result(), dic As Object
Dim arrItems
Dim lngIndex As Long
Dim bMatchFirst As Boolean
Dim Start
  Start = Timer
  bMatchFirst = False
  With Worksheets("OVERLAY")
    'cleaning old results
    lastRow = .Cells(Rows.count, "A").End(xlUp).Row
    If lastRow > 1 Then .Range("A2").Resize(lastRow - 1, 9).ClearContents
  End With
  With Worksheets("NDS_SHEET")
    lastRow = .Cells(Rows.count, "A").End(xlUp).Row
    If lastRow = 4 Then Exit Sub    ' there is no data in NDS_SHEET
    data = .Range("A5").Resize(lastRow - 4, 8).value    ' A:H to data
  End With
  With Worksheets("REF_SHEET")
    lastRow = .Cells(Rows.count, "A").End(xlUp).Row
    If lastRow = 1 Then Exit Sub    ' there is no data in REF_SHEET
    find_data = .Range("A2").Resize(lastRow - 1, 2).value
  End With
  Set dic = CreateObject("Scripting.Dictionary")
  dic.CompareMode = vbTextCompare
  For i = 1 To UBound(data, 1)
    filename = data(i, 8)  ' column 8 sheet NDS_SHEET
    'we only take the first occurrence of a given filename
    'If Not dic.Exists(filename) Then dic.Add filename, i
    'No. We take "ALL" filesnames.
    dic.Add i, filename
  Next i
  arrItems = dic.Items
  ReDim result(1 To UBound(find_data, 1), 1 To 9)
  For i = 1 To UBound(find_data, 1)
    For lngIndex = 0 To UBound(arrItems)
    'For Each key In dic.Keys
       'pos = InStr(1, find_data(i, 2), key, vbTextCompare)
       pos = InStr(1, find_data(i, 2), arrItems(lngIndex), vbTextCompare)
       If pos Then
         count = count + 1
         result(count, 1) = find_data(i, 1)
         'curr_row = dic.Keys(i).Value    ' line number in data() with given filename
         curr_row = lngIndex + 1 'dic.Items(lngIndex).Value
         For j = 1 To 8
           result(count, j + 1) = data(curr_row, j)
         Next j
         If bMatchFirst Then Exit For
       End If
    Next lngIndex
  Next i
  If count Then Worksheets("OVERLAY").Range("A2").Resize(count, 9).value = result
  Set dic = Nothing
  MsgBox Timer - Start
End Sub

Sub YourLastModified()
Dim Start
  Start = Timer

Dim lastRow As Long, pos As Long, i As Long, j As Long, k As Long, count As Long, curr_row As Long, filename As String, key, item, data(), find_data(), result(), dic As Object
Dim bFirstMatch As Boolean
  bFirstMatch = False
    With Worksheets("OVERLAY")
'        cleaning old results
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow > 1 Then .Range("A2").Resize(lastRow - 1, 9).ClearContents
    End With
    
    With Worksheets("NDS_SHEET")
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow = 4 Then Exit Sub    ' there is no data in NDS_SHEET
        data = .Range("A5").Resize(lastRow - 4, 8).value    ' A:H to data
    End With
    
    With Worksheets("REF_SHEET")
        lastRow = .Cells(Rows.count, "A").End(xlUp).Row
        If lastRow = 1 Then Exit Sub    ' there is no data in REF_SHEET
        find_data = .Range("A2").Resize(lastRow - 1, 2).value
    End With
    
    Set dic = CreateObject("Scripting.Dictionary")
    dic.CompareMode = vbTextCompare
    For i = 1 To UBound(data, 1)
        filename = data(i, 8)  ' column 8 sheet NDS_SHEET
        If Not dic.Exists(filename) Then
            ReDim item(1 To 1)
'            first duplicate row number
            item(1) = i
            dic.Add filename, item
        Else
            item = dic.item(filename)
            ReDim Preserve item(1 To UBound(item) + 1)
'            next duplicate row number
            item(UBound(item)) = i
            dic.item(filename) = item
        End If
    Next i
'    in case of and duplicates in NDS_SHEET and duplicates in REF_SHEET the result array is the largest
    ReDim result(1 To UBound(find_data, 1) * UBound(data, 1), 1 To 9)
    For Each key In dic.Keys
        item = dic.item(key)    ' array of duplicate row numbers in column H sheet NDS_SHEET
        For i = 1 To UBound(find_data, 1)
            pos = InStr(1, find_data(i, 2), key, vbTextCompare)
            If pos Then
                For k = 1 To UBound(item)
                  'If k > 1 Then MsgBox item(k)
                  count = count + 1
                  result(count, 1) = find_data(i, 1)
                  curr_row = item(k)    ' line number in data() with given filename
                  For j = 1 To 8
                    result(count, j + 1) = data(curr_row, j)
                 Next j
                 If bFirstMatch Then Exit For
               Next k
            End If
        Next i
    Next key
    
    If count Then Worksheets("OVERLAY").Range("A2").Resize(count, 9).value = result
    
    Set dic = Nothing
    MsgBox Timer - Start
End Sub
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #21  
Old 01-30-2025, 11:18 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

Quote:
Originally Posted by gmaxey View Post
Batman,
So, Solved already? Good luck
Reply With Quote
  #22  
Old 01-30-2025, 01:03 PM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Well not solved. Moved along certainly. Want to leave the thread open in case someone else offers a different approach.
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #23  
Old 01-30-2025, 05:53 PM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Quote:
Originally Posted by batman1 View Post
So, Solved already? Good luck

Batman,


After working with this code a bit more, I am going to mark this thread solved. I found that I didn't really need to use a dictionary at all and I believe the real efficiency gain was writing data to the arrays vice directly in the worksheet. The following code does 95% on what I need and with a 440,000 row REF sheet, it does it in under 9 seconds. That is a big improvement and I think with a little more work I can call the cannon ball fully polished. If someone comes along in this tread or the crosspost with a PowerQuery solution, I'll be eager to look at it but you have been a tremendous help. Thank you again!!


Code:
Option Explicit
Sub MyModified()
Dim oOS As Worksheet
Dim strColHeadings As String
Dim varNDS_Data(), varREF_Data(), varOverlay_Data()
Dim lngST As Long, lngIndex As Long, lngFldIndex As Long, lngItemIndex As Long, lngRecordIndex As Long, lngNDSRowIndex As Long
Dim lngOSRow As Long, lngOSCol As Long
Dim strFileName As String
Dim bMatched As Boolean, bNoMatch As Boolean, bNoREFMatchCanx As Boolean, bNoREFMatchCopyID As Boolean, bFirstMatch As Boolean
Dim bDupRow As Boolean, bConcatenateDupSC As Boolean, bConcatenateDupLF As Boolean, bNoMultiMatches As Boolean
Dim bDelim As Boolean, strDelimiter As String
  lngST = Timer
  strColHeadings = "REF/Contol#|Ser. Nb|Document Type|Document Date|Classification|Title|Description|Has Attachments|File Name"
  bMatched = False
  'NDS_Options Note: Only one on the next line shoul be true
  bFirstMatch = True: bDupRow = False: bConcatenateDupSC = False: bConcatenateDupLF = False: bNoMultiMatches = False
  'REF_Options Note: Normally both on next line are Fallse.  Only one on the next line can be true
  bNoREFMatchCopyID = False: bNoREFMatchCanx = False
  strDelimiter = ";": If bConcatenateDupLF = True Then strDelimiter = vbCrLf
  On Error Resume Next
  Set oOS = Worksheets("OVERLAY")
  If Err.Number = 0 Then
    Application.DisplayAlerts = False
    oOS.Delete
    Application.DisplayAlerts = True
  End If
  On Error GoTo 0
  Set oOS = Worksheets.Add
  oOS.Name = "OVERLAY"
  DoEvents
  With Worksheets("NDS_SHEET")
    lngIndex = .Cells(Rows.count, "A").End(xlUp).Row
    If lngIndex = 4 Then Exit Sub 'There is no data in NDS_SHEET
    varNDS_Data = .Range("A5").Resize(lngIndex - 4, 8).Value
  End With
  With Worksheets("REF_SHEET")
    lngIndex = .Cells(Rows.count, "A").End(xlUp).Row
    If lngIndex = 1 Then Exit Sub 'There is no data in REF_SHEET
    varREF_Data = .Range("A2").Resize(lngIndex - 1, 2).Value
  End With
  ReDim varOverlay_Data(1 To UBound(varREF_Data, 1), 1 To 9)
  For lngIndex = 1 To UBound(varREF_Data, 1)
    bNoMatch = True
    bMatched = False
    For lngItemIndex = 1 To UBound(varNDS_Data, 1)
      If InStr(1, varREF_Data(lngIndex, 2), varNDS_Data(lngItemIndex, 8), vbTextCompare) > 0 Then
        bNoMatch = False
        If Not bMatched Then
          lngRecordIndex = lngRecordIndex + 1
          varOverlay_Data(lngRecordIndex, 1) = varREF_Data(lngIndex, 1)
          lngNDSRowIndex = lngItemIndex + 1
          For lngFldIndex = 1 To 8
            varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varNDS_Data(lngNDSRowIndex, lngFldIndex)
          Next lngFldIndex
          If bFirstMatch Then Exit For
          bMatched = True
        Else
          If Not bNoMultiMatches Then
            If bDupRow Then
              lngRecordIndex = lngRecordIndex + 1
              varOverlay_Data(lngRecordIndex, 1) = varREF_Data(lngIndex, 1)
              lngNDSRowIndex = lngItemIndex + 1
              For lngFldIndex = 1 To 8
                varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varNDS_Data(lngNDSRowIndex, lngFldIndex)
              Next lngFldIndex
            Else
              For lngFldIndex = 1 To 8
                varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varOverlay_Data(lngRecordIndex, lngFldIndex + 1) & strDelimiter & varNDS_Data(lngNDSRowIndex, lngFldIndex)
              Next lngFldIndex
            End If
          Else
            MsgBox "Overlay function canceled due to multiple match on REF_SHEET row: " & lngIndex + 1
            GoTo lbl_Exit
          End If
        End If
      End If
   Next lngItemIndex
   Select Case True
     Case bNoREFMatchCopyID And bNoMatch
       lngRecordIndex = lngRecordIndex + 1
       'varOverlay_Data(lngRecordIndex, 1) = varREF_Data(lngIndex, 1)
     Case bNoREFMatchCanx And bNoMatch
       MsgBox "Overlay function canceled due to no NDS match for REF_SHEET row: " & lngIndex
       Exit For
   End Select
  Next lngIndex
  If lngRecordIndex Then oOS.Range("A2").Resize(lngRecordIndex, 9).Value = varOverlay_Data
  With oOS
     .Range("A1").Resize(1, 9).Value = Split(strColHeadings, "|")
    DoEvents
    .Rows(1).Font.Bold = True
    .Rows(1).AutoFilter
    .Rows(1).Select
    .Application.ActiveWindow.SplitColumn = 1
    .Application.ActiveWindow.SplitRow = 1
    .Application.ActiveWindow.FreezePanes = True
    DoEvents
    With .UsedRange
      .WrapText = False
      .EntireColumn.AutoFit
      If .ColumnWidth > 60 Then .ColumnWidth = 60
    End With
    DoEvents
    lngOSRow = 2
    Do While Len(.Cells(lngOSRow, 2).Value) > 0
      lngOSCol = 2
      Do While Len(.Cells(1, lngOSCol).Value) > 0
        .Cells(lngOSRow, lngOSCol).FormulaR1C1 = .Cells(lngOSRow, lngOSCol).Value
        lngOSCol = lngOSCol + 1
      Loop
      lngOSRow = lngOSRow + 1
    Loop
    .UsedRange.Columns.AutoFit
    .UsedRange.Rows.AutoFit
    DoEvents
    If bDupRow Then
      With .Columns(1)
        .FormatConditions.AddUniqueValues
        .FormatConditions(.FormatConditions.count).SetFirstPriority
        With .FormatConditions(1)
          .DupeUnique = xlDuplicate
          .Font.Color = -16383844
          .Font.TintAndShade = 0
          .Interior.PatternColorIndex = xlAutomatic
          .Interior.Color = 13551615
          .Interior.TintAndShade = 0
          .StopIfTrue = False
        End With
      End With
    End If
  End With
  MsgBox Timer - lngST
lbl_Exit:
  Set oOS = Nothing
  Exit Sub
lbl_Canx:
  Application.DisplayAlerts = False
  oOS.Delete
  Application.DisplayAlerts = True
  GoTo lbl_Exit
End Sub
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #24  
Old 01-31-2025, 04:53 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

Quote:
Originally Posted by gmaxey View Post
Batman,


After working with this code a bit more, I am going to mark this thread solved. I found that I didn't really need to use a dictionary at all and I believe the real efficiency gain was writing data to the arrays vice directly in the worksheet. The following code does 95% on what I need and with a 440,000 row REF sheet, it does it in under 9 seconds. That is a big improvement and I think with a little more work I can call the cannon ball fully polished. If someone comes along in this tread or the crosspost with a PowerQuery solution, I'll be eager to look at it but you have been a tremendous help. Thank you again!!

1. Yes, when there is a lot of data, reading/writing to the sheet should not be done cell by cell. Data should be read once into tables, something should be done on this data and once put into the sheet. And a dictionary can be useful. Note that at the beginning I did not know much about what you intended to do, you did not say anything about the data. At the beginning I did not know whether you were taking duplicates in H in the NDS_SHEET sheet but I assumed that I was not taking duplicates. So I am taking only the first one and will not check the next ones. I did not know anything about the data but it is possible that there are e.g. 10 duplicates of "DemoData_0000015_Import.zip". Of course, you can not use the dictionary, but each time you have to assess whether it is better to use the dictionary or not. Let's assume that there are 10 rows in NDS_SHEET and each one contains "DemoData_0000015_Import.zip" and in REF_SHEET 1000 rows and only 1 row contains "DemoData_0000015_Import.zip". There is only 1 result. DIC contains only 1 key and in the worst case, when "DemoData_0000015_Import.zip" is in the last row of REF_SHEET, the code executes 1000 FOR loops (1000 INSTR). If the dictionary is not used, then in the worst case after executing 999*10 = 9990 FOR loops (999 external * 10 internal) - 9990 times of INSTR execution the code will not find a match yet. Only with lngIndex = 1000 and lngItemIndex = 1 will it find a match.

2. I don't understand this

Code:
lngNDSRowIndex = lngItemIndex + 1
…
varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varNDS_Data(lngNDSRowIndex, lngFldIndex)
In my opinion it should be

Code:
lngNDSRowIndex = lngItemIndex
…
varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varNDS_Data(lngNDSRowIndex, lngFldIndex)
or more simply

Code:
varOverlay_Data(lngRecordIndex, lngFldIndex + 1) = varNDS_Data(lngItemIndex, lngFldIndex)
Reply With Quote
  #25  
Old 01-31-2025, 05:37 AM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Batman,


Just headed out for a short leisure trip. Haven't fully digested your comments and I will have to look it over closely but I think you are correct about 2. Thank you.
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #26  
Old 01-31-2025, 06:54 AM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

Quote:
Originally Posted by gmaxey View Post
Batman,


Just headed out for a short leisure trip. Haven't fully digested your comments and I will have to look it over closely but I think you are correct about 2. Thank you.
We assume that
1. REF_SHEET has 1000 rows - the first 999 rows contain e.g. "DemoData_A000001.zip", "DemoData_A000002.zip", ..., "DemoData_A000999.zip" - the last row contains "DemoData_0000015_Import.zip"
2. NDS_SHEET has 10 rows, each containing "DemoData_0000015_Import.zip" Therefore, there is only 1 result: <REF/Contol# from the last row of REF>< 8 columns from the first row of NDS>


A. from DIC - 10 FOR loops to add key = "DemoData_0000015_Import.zip", item = 1 - 999 FOR loops and the code will not find "DemoData_0000015_Import.zip" (the only key with DIC). Only loop 1000 will find "DemoData_0000015_Import.zip" on the last line of REF. In total INSTR is executed 1000 times + 10 times "If Not dic.Exists(filename) Then dic.Add filename, and"


B. Without DIC For lngIndex = 1 to 999 and lngItemIndex = 1 to 10 the code executes 999*10 = 9990 times INSTR, and will not find a match. Only with lngIndex = 1000 and lngItemIndex = 1 will it find a match. In total the code executes INSTR 9990 + 1 = 9991 times.
Reply With Quote
  #27  
Old 02-02-2025, 09:37 AM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Batman,


Yes, you were correct. You seem to have a very good handle on dictionary. Perhaps you can explain why this code errors:


Code:
Sub Test()
Dim oDic As Object, oKey
Dim varVals
Dim lngIndex As Long
   varVals = Split("APPLES,Apples,Pears,Pears,PEARS,pears,Blueberries,APPLES", ",")
   Set oDic = CreateObject("Scripting.Dictionary")
   oDic.CompareMode = 1
   For lngIndex = 0 To UBound(varVals)
     If oDic.Exists(varVals(lngIndex)) Then
       oDic.Item(varVals(lngIndex)) = oDic.Item(varVals(lngIndex)) + 1
     Else
       oDic.Add varVals(lngIndex), 1
     End If
   Next
   For Each oKey In oDic.Keys
     MsgBox oKey & " count = " & oDic.Item(oKey)
   Next oKey
   'Why can't set oKey to a spefic key index e.g.:
   oKey = oDic.Keys(1).Value  'Errors here
   MsgBox oKey & " count = " & oDic.Item(oKey)
lbl_Exit:
  Exit Sub
End Sub
Looking at the Debugger Watches windows, oDic.Keys(1) definitely has a value.
Attached Images
File Type: jpg Error.jpg (105.6 KB, 11 views)
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #28  
Old 02-02-2025, 12:19 PM
batman1 batman1 is offline Identify matches between sheet columns Windows 11 Identify matches between sheet columns Office 2013
Advanced Beginner
 
Join Date: Jan 2025
Posts: 57
batman1 is on a distinguished road
Default

Quote:
Originally Posted by gmaxey View Post
Batman,
Perhaps you can explain why this code errors:
1. oKey = oDic.Keys(1).Value

oDic.Add varVals(lngIndex), 1
=> Key = varVals(lngIndex) = STRING
Even if oDic.Keys returns key, key is not an object so it cannot have a property, and it certainly cannot have a VALUE property

2. oKey = oDic.Keys(1) still causes an error. Why?

Take a test and change to…
menu Tools -> References -> select "Microsoft Scripting Runtime"
Code:
Subtest2()
Dim oDic As Scripting.Dictionary, oKey
Dim varVals
Dim lngIndex As Long
 varVals = Split("APPLES,Apples,Pears,Pears,PEARS,pears,Blueberries,APPLES", ",")
 Set oDic = New Scripting.Dictionary
 oDic.CompareMode = 1
 For lngIndex = 0 To UBound(varVals)
 If oDic.Exists(varVals(lngIndex)) Then
 oDic.Item(varVals(lngIndex)) = oDic.Item(varVals(lngIndex)) + 1
 Else
 oDic.Add varVals(lngIndex), 1
 EndIf
Next
For Each oKey In oDic.Keys
MsgBox oKey & " count = " & oDic.Item(oKey)
Next oKey
oKey = oDic.Keys(1)
MsgBox oKey & " count = " & oDic.Item(oKey)
lbl_Exit:
Exit Sub
End Sub
Now there is no error. Why?
---------------
A. My explanation:

VBA is a scripting language, interpreted, not compiled.

In my code, thanks to adding a reference and "early binding", the interpreter already "at the beginning" knows that Keys is a method - a function without a parameter that returns an array, so (1) means the first element of this array. In your code, because of "late binding", the interpreter does not know whether Keys is a method or properties. It's probably looking for a Keys function with one parameter, and there isn't one.

--------------------------------
With "late binding" (your code) you have 2 ways:
1.
Code:
Dim arrKey
arrKey = oDic.Keys ' the interpreter looks for method Keys without a parameter, and it will definitely find
oKey = arrKey(1) ' Keys returns an array, so arrKey is an array and arrKey(1) is the array element at index 1
2.
Code:
oKey = oDic.Keys()(1) ' oDic.Keys() means method without a parameter, and returns an array, so (1) is the array element at index 1
MsgBox oKey & " count = " & oDic.Item(oKey)
I don't know if in the "A. My explanation:" I will explain correctly. In any case, with "late binding" (your code) you have 2 methods given above
Reply With Quote
  #29  
Old 02-02-2025, 01:36 PM
gmaxey gmaxey is offline Identify matches between sheet columns Windows 10 Identify matches between sheet columns Office 2019
Expert
Identify matches between sheet columns
 
Join Date: May 2010
Location: Brasstown, NC
Posts: 1,598
gmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nicegmaxey is just really nice
Default

Yes. That all makes sense. There is one final piece to the larger project you have been involved with. I will post it later this evening to see if you have an idea. Thank you.
__________________
Greg Maxey
Please visit my web site at http://www.gregmaxey.com/
Reply With Quote
  #30  
Old 02-02-2025, 01:43 PM
snb snb is offline Identify matches between sheet columns Windows XP Identify matches between sheet columns Office 2010
Novice
 
Join Date: Feb 2025
Posts: 1
snb is on a distinguished road
Default

I'd use:

Code:
Sub M_snb()
   sn = Sheet1.Range("H2:H23")
   sp = [transpose(ref_sheet!B2:B17)]
   
   For j = 1 To UBound(sn)
     c00 = c00 & vbLf & Join(Filter(sp, sn(j, 1)), vbLf)
   Next
   
   y = UBound(Filter(Split(c00, vbLf), ".")) + 1
   MsgBox Join(Filter(Split(c00, vbLf), "."), vbLf), , y
End Sub
Reply With Quote
Reply



Similar Threads
Thread Thread Starter Forum Replies Last Post
How to Remove the subtotal columns in a sheet Marcia Excel 3 12-01-2023 05:48 AM
Identify matches between sheet columns Relating cells in a row in one sheet to cells in columns on another sheet. mbesspiata3 Excel 2 01-06-2017 05:42 AM
Identify matches between sheet columns Create a New Sheet from Existing Sheet with Specific Columns malam Excel Programming 1 10-17-2014 10:01 PM
Identify matches between sheet columns From an XL sheet ,how to keep the group of columns which match with other XL sheet Zubairkhan Excel 2 03-04-2014 10:57 PM
Removing columns within sheet shabbaranks Excel 2 09-11-2012 05:03 AM

Other Forums: Access Forums

All times are GMT -7. The time now is 05:28 AM.


Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2025, vBulletin Solutions Inc.
Search Engine Optimisation provided by DragonByte SEO (Lite) - vBulletin Mods & Addons Copyright © 2025 DragonByte Technologies Ltd.
MSOfficeForums.com is not affiliated with Microsoft