#1
|
|||
|
|||
Slow "comparison/replace" script
Hi everyone, this script is supposed to compare two ranges and in case of no match replace cells with text. Im quite new to VBA and when I usually write a script, it performs poorly on big data sets. Is there any way, how to make this work faster? I have tried Application.ScreenUpdating = False, but it doesnt work.
Thank you in advance! Code:
Sub RangeCompare() Dim Range1 As Range, Range2 As Range, c As Range On Error Resume Next Set Range1 = ActiveWorkbook.Sheets("Master Slide").Range("a4:a90") Set Range2 = Application.InputBox("Select Range2:", Title:="Get Range2", Type:=8) On Error GoTo 0 For Each c In Range2.Cells If Len(c) > 0 Then If Application.WorksheetFunction.CountIf(Range1, c.Value) = 0 Then c.Formula = "Others" End If End If Next c End Sub |
#2
|
|||
|
|||
Hi
Your code isn't that bad, a 100,000 rows are done within 3.0 seconds. There are a few things still you should observe: - Variables which are "Set" to, you should unload once they are of no use anymore - WorksheetFunctions used in vba are slow - Reading and especially writing from/to a worksheet is very slow, it should therefore be done rather in one strike Observing point1 and 2 get's the same result in 2.26 seconds: Code:
Sub RangeCompare2() Dim arrCompare As Variant Dim objdic As Object Dim lngItem As Long Dim Range2 As Range, c As Range Dim dblTimer As Double Set objdic = CreateObject("scripting.dictionary") arrCompare = ActiveWorkbook.Sheets("Master Slide").Range("a4:a90") On Error GoTo TheEnd Set Range2 = Application.InputBox("Select Range2:", Title:="Get Range2", Type:=8) dblTimer = Timer For lngItem = 1 To UBound(arrCompare, 1) If Len(CStr(arrCompare(lngItem, 1))) > 0 Then objdic(arrCompare(lngItem, 1)) = "" End If Next lngItem For Each c In Range2.Cells If Not objdic.Exists(c.Value) Then c.Value = "Others" End If Next c MsgBox Timer - dblTimer TheEnd: Set objdic = Nothing Set Range2 = Nothing End Sub Code:
Sub RangeCompare3() Dim arrCompare As Variant, arrData As Variant Dim objdic As Object Dim lngItem As Long Dim Range2 As Range Dim dblTimer As Double Set objdic = CreateObject("scripting.dictionary") arrCompare = ActiveWorkbook.Sheets("Master Slide").Range("a4:a90") On Error GoTo TheEnd Set Range2 = Application.InputBox("Select Range2:", Title:="Get Range2", Type:=8) dblTimer = Timer For lngItem = 1 To UBound(arrCompare, 1) If Len(CStr(arrCompare(lngItem, 1))) > 0 Then objdic(arrCompare(lngItem, 1)) = "" End If Next lngItem arrData = Range2.Value For lngItem = 1 To UBound(arrData, 1) If Not objdic.Exists(arrData(lngItem, 1)) Then arrData(lngItem, 1) = "Others" End If Next lngItem Range2.Value = arrData MsgBox Timer - dblTimer TheEnd: Set objdic = Nothing Set Range2 = Nothing End Sub |
#3
|
||||
|
||||
Interesting. I never knew about the Application.InputBox method, so I've learned something new. And I don't know why you'd want to do this with a macro instead of just putting the COUNTIF function in the sheet itself. (But clearly you realize you could do that, so I assume you have a reason for wanting to do it this way.)
It isn't clear to me why this would be slow to run. The hard-coded COUNTIF range is fewer than 100 cells, so it should be pretty fast. How many cells are typically in Range2? I would think anything under 250 would be no problem. All that aside, I can see a more complicated but much quicker way to write this. The way you set it up, Excel has to COUNTIF all 87 cells in Range1 for every cell in Range2. The quicker way to do it is to COUNTIF all the values in Range1 once, table the results and then check the table while processing Range2. In fact, you don't even need the count, do you? You just want to know whether a given value appears at all in Range1. I started to show sample code that would do this, but realized that I was assuming that all the values you're checking on are text, not numeric. How you do this depends on whether the values you're checking for are all text, all numeric or may be either. Tell me about that, please. |
#4
|
||||
|
||||
Bob: There really isn't an issue with small datasets - inefficient code will still work quite quickly with those. The efficiency issues become apparent when large datasets are involved. As whatsup's code shows, efficiency gains of up to 10x are possible and, with large datasets, that can make a significant difference.
PS: Although it's good practice, I doubt that cleaning up after the 'Set' variables makes a material difference to the execution speed.
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#5
|
|||
|
|||
Quote:
Memory assigned is restricted to 512 MB no matter of the size of RAM available. In filling up the "pot" it slows down speed. |
#6
|
||||
|
||||
In the middle of the run I can believe it could make a difference. But you did the two Set=Nothings at the very end of the program, right? It isn't obvious to me how that could make even an immeasurable difference. Am I missing something?
|
#7
|
|||
|
|||
Bob, I'm erasing set object variables from memory manually out of two reasons:
- Variables created within a macro should be erased from memory automatically once the macro is done. With set objects in former times it didn't always work this way and you were left to an error of stackoverflow in the worst case or to a poor performance. Since I don't know if the problem is fixed, I just continue erasing them manually. - The main reason: There are different types of objects, for instance classes you must unload manually; if creating a new instance of an object (whenever using the keyword "New") you have to unload it manually as well - even more important because depending of the object it might use huge capacity in memory. Now, tell me are there more? Lazy as I am, I don't want to think about it neither look it up, but destroy each of them manually once they did their job. And it's not a big deal to write the additional instructions. |
#8
|
||||
|
||||
I think you're saying that if I don't free an object manually, there are circumstances where it might not be freed by the VBA editor as the macro comes to an end. I don't think I ever knew that (or if I did, I forgot). That does make a difference, especially because I tend to leave my laptop on for a week or three at a time and Excel is unlikely to be closed at any time during that period. Run macros again and again, and this could get to be a problem.
Well, I don't know that I'll change my habits overnight. But at least I'm sure to be suspicious if things slow down suddenly, in the near future; I may discover that I'm well advised to do the same as you. |
#9
|
||||
|
||||
Slow "comparison/replace" script
@whatsup, its very cool to use the .Exists method of a Dictionary Object as a super efficient search. But I cant seem to see where you populate the Dictionary. I cant see a loop where use use .Add to poke the contents of your array in there. How does that work?
Another point I would add to your solution with regard to speed, is to always use the Value2 property of the Range Object instead of the Value property for bulk read/write. There are overheads associated with type conversions in the Value property which are not present for Value2. The later just blasts the literal bytes across without abstraction. The final point I would make is that there is no need to set local objects to nothing at the end of the sub. This is less efficient than letting VBARuntime do it for you, which it will. It also clutters up your code. VBA uses reference counting to manage object life cycle, so if there are no references to an object then its deallocated from memory and this inherent process is much more efficient than setting objects to Nothing. As a further note, setting a global (Module level) object to nothing does not release it from memory if there are still other objects referencing it. |
#10
|
||||
|
||||
Quote:
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#11
|
||||
|
||||
Well... I think its strange to put some code in, just in case there is a bug in the scripting engine.
Methinks you worry too much? Well... Mebelieves is probably more apropriate. I'm always happy to learn, but I am cautious of mythology... I didn't see any authoritative references offered above in support of this practice. I can find plenty refuting it and even more who are confused about it... And there are plenty of structures where it will do nothing at all because there are references to the object elsewhere. Referring to post #7 there are two reasons given for the practice. Regarding the first, see my opening statement. Regarding the second, it is patently untrue I'm afraid: Custom Class Objects and Objects that are instantiated with a New statement do not need to be explicitly deallocated with a Set to Nothing line. They are managed in the same way as any other object again, by counting referrences. And, depending on their archiatecture, it could give the misleading impression that they have been released when they have not. So, to be honest, I don't accept either as a valid reason. |
#12
|
||||
|
||||
Quote:
http://www.vbi.org/Items/article.asp?id=106 http://www.decisionmodels.com/memlimitsd.htm
__________________
Cheers, Paul Edstein [Fmr MS MVP - Word] |
#13
|
|||
|
|||
Hi
CoolBlue, as to your question: The loop is there, though it's rather a For ... Next: Code:
. . . For lngItem = 1 To UBound(arrCompare, 1) If Len(CStr(arrCompare(lngItem, 1))) > 0 Then objdic(arrCompare(lngItem, 1)) = "" End If Next lngItem . . . Code:
Sub AddCausingError() Dim objdic As Object Set objdic = CreateObject("scripting.dictionary") objdic.Add 1, "" objdic.Add 1, "" Set objdic = Nothing End Sub Code:
Sub AvoidErrorAdd() Dim objdic As Object Set objdic = CreateObject("scripting.dictionary") objdic.Add 1, "" If Not objdic.Exists(1) Then objdic.Add 1, "" Else MsgBox "Key already exists" End If Set objdic = Nothing End Sub Code:
Sub JustOverwrite() Dim objdic As Object Set objdic = CreateObject("scripting.dictionary") objdic(1) = "1stEntry" objdic(1) = "2ndEntry" Debug.Print objdic.Count Debug.Print objdic(1) Set objdic = Nothing End Sub For the other part, well, Paul has provided you by now with some evidence, and there is more out there. I'm pretty aware of other opinions pointing out it's not necessary and I won't interfere if they got the convidence and made up their mind. There are also people believing it's not necessary to declare variables, pointing out that vba comes by default without "Option Explicit". Well, some day they will discover an error and wonder were it comes from ... |
#14
|
||||
|
||||
Quote:
VBA Garbage Collection. @macropod First of all, thanks for the links and just to be clear, I'm not trying to flame anyone here and I won't be making comments like "you didn't look very hard" coz I don't know that for a fact. And for the record, I did a lot of reading before posting. I'm interested in the topic specifically because of the volume and diversity of opinion that I saw. In your first link, they are talking about Access and most of the objects are DAO objects. The example code uses the pattern you advocate but there is no instruction or recommendation for using this pattern. They say that you can free up memory by setting objects to nothing but they do not say that failing to do so, just before a local object goes out if scope, will cause a problem. It was produced by Microsoft but then this guy is also from Microsoft and he's is saying explicitly that it's not necessary and he's no slouch... http://blogs.msdn.com/b/ericlippert/...28/122259.aspx Here's another guy with a solid background who also has a definitive opinion on the issue... http://ramblings.mcpher.com/Home/exc.../scope/garbage If you google VBA garbage collection you will get a lot if hits and I read quite a few if them before my first post on the topic. I found this to be a good representation of the spread of opinion... https://groups.google.com/forum/m/?h...vb/3pw-TGc9PSo The common theme of the two camps is: On the affirmative: I believe I should do it, so I do it, just in case. On the negative: it's not necessary and here is a clear explanation as to why... Your second link is basically supporting what I'm saying: that setting objects to nothing doesn't always do what you expect and that you have to be careful how you construct them and how you tear them down if they have complicated structures. Furthermore, he explicitly says that just setting objects to nothing at the end if a sub is a bad idea. So it's counter to your argument. In your third link there is a list if memory leak sources... Page Setup in Win95 & Win98 & Win ME (GDI problem) - see MSKB Q192869 - NOT fixed by setting objects to Nothing Printing using HP Printers - see MSKB Q165985 and Q218864 - NOT fixed by setting objects to Nothing Querying an open Excel worksheet using ADO - see MSKB Q319998 - NOT fixed by setting objects to Nothing "The memory used by the ADO queries cannot be reclaimed by closing and releasing the ADO objects. The only way to release the memory is to quit Excel." Replacing external link formulae - NOT fixed by setting objects to Nothing Some external data retrieval actions, particularly repeatedly reading text files. Excel seems to store only unique strings rather than multiple occurrences of the same string, but does not seem to remove a unique string when it is no longer in use. - NOT fixed by setting objects to Nothing Controls embedded on worksheets - see MSKB Q238570 - applies to excel97 - NOT fixed by setting objects to Nothing User Defined Functions, particularly when they take input parameters from another sheet and return a string. there are some fixes for these problems in Excel 97 SR2 and Excel 2000 SP2 - see MSKB Q265023 - NOT fixed by setting objects to Nothing Not destroying objects in VBA by setting the object variables to nothing in the correct sequence so that you avoid "orphaned" objects. In theory this is not neccessary, but sometimes problems occur when this is not done. Destroy in inside-out container sequence, for example Range then Worksheet then Workbook. This is similar to your second link and points out the need for proper tear-down of non-trivial custom objects. - NOT fixed by simply setting objects to Nothing prior to them going out of scope and not applicable to local scoped objects. So... I'm just sayin' |
#15
|
||||
|
||||
Quote:
I have worked with Collections in the past and haven't delved into Dictionaries so, thanks for opening my eyes. I love it... you just jam in the Keys by referencing by it and set the member to "". And there they all are. Very nice! Using your method on my system it blasts through 100K lines in 1.6 to 2 sec, depending on the data type. The data I used had about 30% "Others". My little Value2 tweak gives a big advantage to the block reads, but its swamped by savings from your Dictionary jggery-pokery, it depends on the data type: Value: Number or Text 1.6 to 1.7 sec Currency 1.8 to 2 sec; Date 2 sec Value2: Number or Text 1.5 sec; Currency 1.5 sec; Date 1.5 sec And Yep. On the matter of the Set to Nothing, obviously its a personal, value choice between equally respectable options. My aim was to just balance the thread so that people can make their own choice based on both arguments. I actually tried removing your error handler and re-adding the first member to the Dictionary to throw an error, just to see what happened and the object was released, according to VBE anyway... Cheers |
|
Similar Threads | ||||
Thread | Thread Starter | Forum | Replies | Last Post |
Wierd "script code" in a downloaded .doc file | CNBarnes | Word | 2 | 10-18-2012 02:07 AM |
replace data from variable with "sub and super script" from excel to word by vba | krishnaoptif | Word VBA | 9 | 06-22-2012 05:08 AM |
How to choose a "List" for certain "Heading" from "Modify" tool? | Jamal NUMAN | Word | 2 | 07-03-2011 03:11 AM |
Rules and Alerts: "run a script"? | discountvc | Outlook | 0 | 06-15-2010 07:36 AM |
An "error has occurred in the script on this page" | decann | Outlook | 8 | 09-03-2009 08:54 AM |