Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
There's got to be a faster way to do this, so I come to the brain trust asking for brainy guidance!
I am comparing two ranges, each with appx 7000 items in them. I want to see if an item in list1 is also in list2. If it is, i want to hide that row. Right now, I am going through list1, item by item, and comparing it to list2 item by item via a double loop to achieve this. The problem is that this results in appx 40,320,350 individual iterations of my loop, thus slowing things down considerably. Even with screen updating turned off this is taking quite a bit of time. Is there a way to make this run faster? below is my code: Sub compare() Application.ScreenUpdating = False Dim todaycell As Range Dim yesterdaycell As Range For i = 1 To lastrow(Sheet9) Set yesterdaycell = Sheet9.Range("a" & i) For ii = 1 To lastrow(Sheet10) Set todaycell = Sheet10.Range("a" & ii) If yesterdaycell.Value = todaycell.Value Then yesterdaycell.EntireRow.Hidden = True End If Next ii Next i Application.ScreenUpdating = True End Sub |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Matthew Dyer" wrote:
I am comparing two ranges, each with appx 7000 items in them. [....] The problem is that this results in appx 40,320,350 individual iterations of my loop Is there a way to make this run faster? Several. First and foremost: copy the values of each range into variant arrays. Also note the GoTo; that avoids continuing to compare with rows that you have already hidden. Thus (untested): Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Application.ScreenUpdating = False n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9.Range("a1:a" & n9) yesterdaycells = r todaycells = Sheet10.Range("a1:a" & n10) For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True GoTo getNext9 End If Next ii getNext9: Next i Application.ScreenUpdating = True End Sub It might also improve speed if you build a Union of rows to hide. But I don't know what and if any size limits might apply. |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Improvement, replacing GoTo with more-structured Exit For.
Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Application.ScreenUpdating = False n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9.Range("a1:a" & n9) yesterdaycells = r todaycells = Sheet10.Range("a1:a" & n10) For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i Application.ScreenUpdating = True End Sub |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Wednesday, October 24, 2012 8:30:25 AM UTC-7, joeu2004 wrote:
Improvement, replacing GoTo with more-structured Exit For. Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Application.ScreenUpdating = False n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9.Range("a1:a" & n9) yesterdaycells = r todaycells = Sheet10.Range("a1:a" & n10) For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i Application.ScreenUpdating = True End Sub combining both of your reccomendations I was able to significantly cut down the time by well over half! By setting both today and yesterday ranges in active memory then running my itterations that way it runs much more efficiently! Plus the Exit For line makes sure i'm not running additional iterations for a 'find' that already occured! Thanks for the help! Sub compare() Application.ScreenUpdating = False Dim todaycell As Variant Dim yesterdaycell As Variant Dim todayrange As Range Dim yesterdayrange As Range Set todayrange = Sheet10.Range("a1:a" & lastrow(Sheet10)) Set yesterdayrange = Sheet9.Range("a1:a" & lastrow(Sheet9)) For i = 1 To lastrow(Sheet9) Set yesterdaycell = yesterdayrange(i) For ii = 1 To lastrow(Sheet10) Set todaycell = todayrange(ii) If yesterdaycell = todaycell Then yesterdayrange(i).EntireRow.Hidden = True Exit For End If Next ii Next i Application.ScreenUpdating = True End Sub |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Matthew Dyer" wrote:
combining both of your reccomendations I was able to significantly cut down the time by well over half! By setting both today and yesterday ranges in active memory then running [....] Set todayrange = Sheet10.Range("a1:a" & lastrow(Sheet10)) Set yesterdayrange = Sheet9.Range("a1:a" & lastrow(Sheet9)) For i = 1 To lastrow(Sheet9) Set yesterdaycell = yesterdayrange(i) For ii = 1 To lastrow(Sheet10) Set todaycell = todayrange(ii) If yesterdaycell = todaycell Then yesterdayrange(i).EntireRow.Hidden = True Exit For End If Next ii Next i You can do much better. The savings by half is probably just due to the Exit For. You missed a key point of my suggestion, namely: to copy the values of each range into variant arrays. By use ``Set rangeVar = range``, you are not bring each into "active memory". Instead, you are merely setting up a "pointer", very much like what a variable __name__ is to its __value__. The point is: with your implementation, each time you reference todayrange(ii), VBA must ask Excel for the value. At a mimimum, you are making about 14,000 such requests -- 7000 for each range. But in fact, you are making up to about 42 million such requests because you might reference the same todayrange(ii) for each of 7000 iterations through yesterdayrange. These "requests" are much more expensive than merely indexing into an array variable. Excel and VBA are separate threads, almost like separate processes. Moving data between the two requires a lot of system overhead. By comparison, what I wrote requires only two such requests (maybe three), one for each range. (The additional requests to hide rows is the same in both cases.) (It is unclear whether each Set statement results in communication between VBA and Excel. I think it does.) Implement my suggestion exactly as I wrote it. But I forgot to include timing code so that you can accurately measure the performance. I also added .Value qualifiers; they are unnecessary in the context, but it might make it clearer to you what the code is doing. To wit: Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Dim st as Single, et as Single Application.ScreenUpdating = False st = Timer n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9.Range("a1:a" & n9) yesterdaycells = r.Value todaycells = Sheet10.Range("a1:a" & n10).Value For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i et = Timer Application.ScreenUpdating = True MsgBox Format(et - st, "0.000") & " sec" End Sub |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Wednesday, October 24, 2012 10:23:36 AM UTC-7, joeu2004 wrote:
"Matthew Dyer" wrote: combining both of your reccomendations I was able to significantly cut down the time by well over half! By setting both today and yesterday ranges in active memory then running [....] Set todayrange = Sheet10.Range("a1:a" & lastrow(Sheet10)) Set yesterdayrange = Sheet9.Range("a1:a" & lastrow(Sheet9)) For i = 1 To lastrow(Sheet9) Set yesterdaycell = yesterdayrange(i) For ii = 1 To lastrow(Sheet10) Set todaycell = todayrange(ii) If yesterdaycell = todaycell Then yesterdayrange(i).EntireRow.Hidden = True Exit For End If Next ii Next i You can do much better. The savings by half is probably just due to the Exit For. You missed a key point of my suggestion, namely: to copy the values of each range into variant arrays. By use ``Set rangeVar = range``, you are not bring each into "active memory".. Instead, you are merely setting up a "pointer", very much like what a variable __name__ is to its __value__. The point is: with your implementation, each time you reference todayrange(ii), VBA must ask Excel for the value. At a mimimum, you are making about 14,000 such requests -- 7000 for each range. But in fact, you are making up to about 42 million such requests because you might reference the same todayrange(ii) for each of 7000 iterations through yesterdayrange. These "requests" are much more expensive than merely indexing into an array variable. Excel and VBA are separate threads, almost like separate processes. Moving data between the two requires a lot of system overhead. By comparison, what I wrote requires only two such requests (maybe three), one for each range. (The additional requests to hide rows is the same in both cases.) (It is unclear whether each Set statement results in communication between VBA and Excel. I think it does.) Implement my suggestion exactly as I wrote it. But I forgot to include timing code so that you can accurately measure the performance. I also added .Value qualifiers; they are unnecessary in the context, but it might make it clearer to you what the code is doing. To wit: Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Dim st as Single, et as Single Application.ScreenUpdating = False st = Timer n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9.Range("a1:a" & n9) yesterdaycells = r.Value todaycells = Sheet10.Range("a1:a" & n10).Value For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i et = Timer Application.ScreenUpdating = True MsgBox Format(et - st, "0.000") & " sec" End Sub I tried copy/paste your code exactly and i get an error at the comparison statement (if x = x). When i said i cut the time by over half, i was under-estimating the time saved. the origional code would have taken appx 405 seconds to run... my version only takes 95 seconds, about 23.5% of the time. With the Exit For statement we also cut the number of comparisons down from 45 mil to 21.5 mil. |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Wednesday, October 24, 2012 10:23:36 AM UTC-7, joeu2004 wrote:
"Matthew Dyer" wrote: combining both of your reccomendations I was able to significantly cut down the time by well over half! By setting both today and yesterday ranges in active memory then running [....] Set todayrange = Sheet10.Range("a1:a" & lastrow(Sheet10)) Set yesterdayrange = Sheet9.Range("a1:a" & lastrow(Sheet9)) For i = 1 To lastrow(Sheet9) Set yesterdaycell = yesterdayrange(i) For ii = 1 To lastrow(Sheet10) Set todaycell = todayrange(ii) If yesterdaycell = todaycell Then yesterdayrange(i).EntireRow.Hidden = True Exit For End If Next ii Next i You can do much better. The savings by half is probably just due to the Exit For. You missed a key point of my suggestion, namely: to copy the values of each range into variant arrays. By use ``Set rangeVar = range``, you are not bring each into "active memory". Instead, you are merely setting up a "pointer", very much like what a variable __name__ is to its __value__. The point is: with your implementation, each time you reference todayrange(ii), VBA must ask Excel for the value. At a mimimum, you are making about 14,000 such requests -- 7000 for each range. But in fact, you are making up to about 42 million such requests because you might reference the same todayrange(ii) for each of 7000 iterations through yesterdayrange. These "requests" are much more expensive than merely indexing into an array variable. Excel and VBA are separate threads, almost like separate processes.. Moving data between the two requires a lot of system overhead. By comparison, what I wrote requires only two such requests (maybe three), one for each range. (The additional requests to hide rows is the same in both cases.) (It is unclear whether each Set statement results in communication between VBA and Excel. I think it does.) Implement my suggestion exactly as I wrote it. But I forgot to include timing code so that you can accurately measure the performance. I also added .Value qualifiers; they are unnecessary in the context, but it might make it clearer to you what the code is doing. To wit: Option Explicit Sub compare() Dim todaycells As Variant, yesterdaycells As Variant Dim r as Range Dim n9 as Long, n10 as Long, i as Long, ii as Long Dim st as Single, et as Single Application.ScreenUpdating = False st = Timer n9 = lastrow(Sheet9) n10 = lastrow(Sheet10) Set r = Sheet9..Range("a1:a" & n9) yesterdaycells = r.Value todaycells = Sheet10.Range("a1:a" & n10).Value For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i) = todaycells(ii) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i et = Timer Application.ScreenUpdating = True MsgBox Format(et - st, "0.000") & " sec" End Sub I tried a copy/paste of your code but get an error message at the comparison statement. I think it's because you are setting 'r' range to the entire range of sheet9, then you set yesterdaycells to r.value... that's one of the reasons i didn't use your code explicitly in my version, i saw the usage of the r variable as redundant. in either case, your setting the ranges in vba and then doing the comparisons directly through vba and only directing back to excel in the hide row event was key to getting the compile time down. I was able to get my run time down from over 6 min to only 95 second. thanks again for all your help! |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Matthew Dyer" wrote:
I tried copy/paste your code exactly and i get an error at the comparison statement (if x = x). "Matthew Dyer" wrote: I tried a copy/paste of your code but get an error message at the comparison statement. I think it's because you are setting 'r' range to the entire range of sheet9, then you set yesterdaycells to r.value No. Your first assessment was correct. I had a common typo (for me): missing the column index. When we copy even a column range into a variant array, the result is a 2-dimensional array. Mea culpa! See the correct and now-tested implementation below. "Matthew Dyer" wrote: I was able to get my run time down from over 6 min to only 95 second. thanks again for all your help! You probably can do a __lot__ better. I tested with two ranges of 1000 cells each, resulting in hiding about 10%. With your original algorithm, it took about 15 seconds. With your improved algorithm, it took about 2 seconds. With my algorithm below, it took about 0.3 seconds. Of course, YMMV due to the size of data and number of hidden rows as well as differences between computers. Minor improvements to note a 1. lastrow() is called only __once__ for each range. 2. And again, VBA communicates with Excel to access each cell value only __once__. 3. Declaring integer variables (e.g. i and ii) as type Long. Note that I use Long, not Integer. It makes no performance difference in modern computers. But importantly, it allows for the possibility that ranges exceed the numerical range of type Integer. PS: Setting yesterdaySheet and todaySheet is "good practice"; but it was a convenience for me. You could Sheet9 and Sheet10 in multiple places, just as you did before. There is no performance difference. It only makes it easier to change the "names" of the worksheets, as I needed to do. ----- Option Explicit Sub compare() Dim yesterdaySheet As Worksheet, todaySheet As Worksheet Dim todaycells As Variant, yesterdaycells As Variant Dim r As Range Dim n9 As Long, n10 As Long, i As Long, ii As Long Dim st As Single, et As Single Application.ScreenUpdating = False Set yesterdaySheet = Sheet9 Set todaySheet = Sheet10 st = Timer n9 = lastRow(yesterdaySheet) n10 = lastRow(todaySheet) Set r = yesterdaySheet.Range("a1:a" & n9) yesterdaycells = r.Value todaycells = todaySheet.Range("a1:a" & n10).Value For i = 1 To n9 For ii = 1 To n10 If yesterdaycells(i, 1) = todaycells(ii, 1) Then r(i).EntireRow.Hidden = True Exit For End If Next ii Next i et = Timer Application.ScreenUpdating = True MsgBox Format(et - st, "0.000") & " sec" End Sub |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
wow... down to 5 seconds for a total of 5.125 sec for 21,595,420 individual comparisons. i think that's a bit more of an improvement! Thanks Again!
|
#10
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
If you don't mind, could you describe the purpose of r in your example? Thanks!
|
#11
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Matthew Dyer" wrote:
could you describe the purpose of r in your example? For the same reason that you did: Set yesterdayrange = yesterdaySheet.Range("a1:a" & lastRow(yesterdaySheet)) It is more principle than practical. I usually avoid computing things multiple times, especially inside a loop that might executed 7000 times in your case. There are many alternative codings. But I certainly would not call lastrow multiple times unnecessarily. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Comparing Arrays | Excel Discussion (Misc queries) | |||
Comparing Arrays | Excel Worksheet Functions | |||
Comparing to Arrays | Excel Programming | |||
Comparing to Arrays | Excel Programming | |||
Comparing to Arrays | Excel Programming |