Home |
Search |
Today's Posts |
|
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
Hello,
Does anyone know a means of comparing two very large arrays (50000 elements) to determine which elements are common to both arrays? I've tried using looping functions and that is far too slow. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
hi,
i don't know why your looping is far too slow, it should not, can you show your code -- isabelle Le 2011-05-21 11:43, Andrew a écrit : Hello, Does anyone know a means of comparing two very large arrays (50000 elements) to determine which elements are common to both arrays? I've tried using looping functions and that is far too slow. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 21, 7:43*pm, isabelle wrote:
hi, i don't know why your looping is far too slow, it should not, can you show your code -- isabelle Le 2011-05-21 11:43, Andrew a écrit : Hello, Does anyone know a means of comparing two very large arrays (50000 elements) to determine which elements are common to both arrays? *I've tried using looping functions and that is far too slow. Isabel, Imagine one range (A) with 50,000 rows, 2 columns, a second range with 50,000 rows (B) 1 column. I want to find all the elements in B which have matches in column 1 of A. For each match, I want to copy the corresponding entry in A column 2 over to a new column 2 in B. Example. A= 1 a 2 f 3 u 4 w 5 q B= 5 1 6 8 9 After running the code, B would be transformed to: B= 5 q 1 a 8 9 6 To do this I create the following loops. For k1 = 1 to 50000 dataX=worksheets(1).cells(k1,1) For k2=1 to 50000 dataY= worksheets(2).cells(k2,1) If dataX=dataY then worksheets(1).cells(k1,2)=worksheets(2).cells(k2,2 ) end if next next This code is accurate, but terrribly slow. This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. My CPU has a 2.5 GHz processor, so this ideally would take 20 seconds. It takes more than that, about 45 seconds. If I were doing this in Matlab or C, I could process this much data in less than a second, but not because the CPU is faster. There are other programming constructs available which are faster. I know Excel can run faster than this, but I don't know any tricks on how to speed this up. Any ideas? |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
hi Andrew,
with the functions "Index" and "Match" you can reduce the number of loop Code:
Sub Macro1() Dim Wks1 As Worksheet, Wks2 As Worksheet, k2 As Integer Set Wks1 = Worksheets(1) Set Wks2 = Worksheets(2) For k2 = 1 To 5 If Not IsError(Application.Match(Wks2.Cells(k2, 1), Wks1.Range("A:A"), 0)) Then Wks2.Cells(k2, 2) = Application.Index(Wks1.Range("B:B"), Application.Match(Wks2.Cells(k2, 1), Wks1.Range("A:A"), 0)) End If Next Set Wks1 = Nothing Set Wks2 = Nothing End Sub -- isabelle Le 2011-05-22 09:34, Andrew a écrit : On May 21, 7:43 pm, wrote: hi, i don't know why your looping is far too slow, it should not, can you show your code -- isabelle Le 2011-05-21 11:43, Andrew a écrit : Hello, Does anyone know a means of comparing two very large arrays (50000 elements) to determine which elements are common to both arrays? I've tried using looping functions and that is far too slow. Isabel, Imagine one range (A) with 50,000 rows, 2 columns, a second range with 50,000 rows (B) 1 column. I want to find all the elements in B which have matches in column 1 of A. For each match, I want to copy the corresponding entry in A column 2 over to a new column 2 in B. Example. A= 1 a 2 f 3 u 4 w 5 q B= 5 1 6 8 9 After running the code, B would be transformed to: B= 5 q 1 a 8 9 6 To do this I create the following loops. For k1 = 1 to 50000 dataX=worksheets(1).cells(k1,1) For k2=1 to 50000 dataY= worksheets(2).cells(k2,1) If dataX=dataY then worksheets(1).cells(k1,2)=worksheets(2).cells(k2,2 ) end if next next This code is accurate, but terrribly slow. This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. My CPU has a 2.5 GHz processor, so this ideally would take 20 seconds. It takes more than that, about 45 seconds. If I were doing this in Matlab or C, I could process this much data in less than a second, but not because the CPU is faster. There are other programming constructs available which are faster. I know Excel can run faster than this, but I don't know any tricks on how to speed this up. Any ideas? |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 22, 7:07*am, isabelle wrote:
Sub Macro1() Dim Wks1 As Worksheet, Wks2 As Worksheet, k2 As Integer Set Wks1 = Worksheets(1) Set Wks2 = Worksheets(2) [....] Set Wks1 = Nothing Set Wks2 = Nothing End Sub Why set Wks1 and Wks2 to Nothing before exiting the macro? Since they are non-static local variables, I would expect VBA to effectively do the same thing automagically. Is there a memory leak in some revisions of VBA? |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
hi joeu2004,
Is there a memory leak ? yes there are in some revisions of VBA? it's valid for all versions of VBA -- isabelle |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
joeu2004 was thinking very hard :
On May 22, 7:07*am, isabelle wrote: Sub Macro1() Dim Wks1 As Worksheet, Wks2 As Worksheet, k2 As Integer Set Wks1 = Worksheets(1) Set Wks2 = Worksheets(2) [....] Set Wks1 = Nothing Set Wks2 = Nothing End Sub Why set Wks1 and Wks2 to Nothing before exiting the macro? Since they are non-static local variables, I would expect VBA to effectively do the same thing automagically. Is there a memory leak in some revisions of VBA? My understanding of how this works is that it's just good programming practice to do this because even though the objects are destroyed when the procedure ends, the memory space that they occupied in the Stack still persists. I may be wrong, but this is how it was explained to me by the leading minds in Excel application development. -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
Imagine one range (A) with 50,000 rows, 2 columns, a second range with
50,000 rows (B) 1 column. I want to find all the elements in B which have matches in column 1 of A. For each match, I want to copy the corresponding entry in A column 2 over to a new column 2 in B. Example. A= 1 a 2 f 3 u 4 w 5 q B= 5 1 6 8 9 After running the code, B would be transformed to: B= 5 q 1 a 8 9 6 To do this I create the following loops. For k1 = 1 To 50000 dataX = Worksheets(1).Cells(k1, 1) For k2 = 1 To 50000 dataY = Worksheets(2).Cells(k2, 1) If dataX = dataY Then Worksheets(1).Cells(k1, 2) = Worksheets(2).Cells(k2, 2) End If Next Next This code is accurate, but terrribly slow. This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. My CPU has a 2.5 GHz processor, so this ideally would take 20 seconds. It takes more than that, about 45 seconds. If I were doing this in Matlab or C, I could process this much data in less than a second, but not because the CPU is faster. There are other programming constructs available which are faster. I know Excel can run faster than this, but I don't know any tricks on how to speed this up. Why use code at all? You can do this with straight Excel formulas. Assuming your range "A" is located on Sheet1, Columns A and B, starting in Row 1; and assuming your range "B" is located in Column A starting in Row 1 on any other sheet, put this formula in B1 on that other sheet and copy it down... =IF(COUNTIF(Sheet1!A$1:A$500,A1),MATCH(A1,Sheet1!A $1:A$500,0),"") Rick Rothstein (MVP - Excel) |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 22, 7:43*am, "Rick Rothstein"
put this formula in B1 on that other sheet and copy it down... =IF(COUNTIF(Sheet1!A$1:A$500,A1), MATCH(A1,Sheet1!A$1:A$500,0),"") Presumably you mean: =IF(COUNTIF(Sheet1!A$1:A$50000,A1), INDEX(Sheet1!B$1:B$50000,MATCH(A1,Sheet1!A$1:A$500 00,0)),"") Or =IF(COUNTIF(Sheet1!A$1:A$50000,A1), VLOOKUP(A1,Sheet1!A$1:B$50000,2,0),"") Since COUNTIF will always do 50000 comparisons, on average this does 75000 comparisons for each of 50000 entries in Sheet2!A1:A50000, a total of 3.75 billion comparisions versus 1.25 billion for Andrew's algorithm (corrected). Nonetheless, this __might__ be an improvement because it avoids the interprocess communication and process scheduling time between VBA and Excel. (But I would be surprised.) |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
put this formula in B1 on that other sheet
and copy it down... =IF(COUNTIF(Sheet1!A$1:A$500,A1), MATCH(A1,Sheet1!A$1:A$500,0),"") Presumably you mean: =IF(COUNTIF(Sheet1!A$1:A$50000,A1), INDEX(Sheet1!B$1:B$50000,MATCH(A1,Sheet1!A$1:A$500 00,0)),"") Or =IF(COUNTIF(Sheet1!A$1:A$50000,A1), VLOOKUP(A1,Sheet1!A$1:B$50000,2,0),"") I can't believe I forgot to included it... I was going for the INDEX formula version. Thanks for catching that. Rick Rothstein (MVP - Excel) |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On Sun, 22 May 2011 06:34:35 -0700 (PDT), Andrew wrote:
On May 21, 7:43*pm, isabelle wrote: hi, i don't know why your looping is far too slow, it should not, can you show your code -- isabelle Le 2011-05-21 11:43, Andrew a écrit : Hello, Does anyone know a means of comparing two very large arrays (50000 elements) to determine which elements are common to both arrays? *I've tried using looping functions and that is far too slow. Isabel, Imagine one range (A) with 50,000 rows, 2 columns, a second range with 50,000 rows (B) 1 column. I want to find all the elements in B which have matches in column 1 of A. For each match, I want to copy the corresponding entry in A column 2 over to a new column 2 in B. Example. A= 1 a 2 f 3 u 4 w 5 q B= 5 1 6 8 9 After running the code, B would be transformed to: B= 5 q 1 a 8 9 6 To do this I create the following loops. For k1 = 1 to 50000 dataX=worksheets(1).cells(k1,1) For k2=1 to 50000 dataY= worksheets(2).cells(k2,1) If dataX=dataY then worksheets(1).cells(k1,2)=worksheets(2).cells(k2, 2) end if next next This code is accurate, but terrribly slow. This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. My CPU has a 2.5 GHz processor, so this ideally would take 20 seconds. It takes more than that, about 45 seconds. If I were doing this in Matlab or C, I could process this much data in less than a second, but not because the CPU is faster. There are other programming constructs available which are faster. I know Excel can run faster than this, but I don't know any tricks on how to speed this up. Any ideas? If you don't want to do it with worksheet formulas, as suggested by Rick, I think the following code will only have to execute 50,000 loops, so should run quicker. It does assume that your RangeA is two columns and RangeB is one column, as you wrote. It also assumes that the data in RangeA is not the result of formulas, but that can obviously be changed in the code. The code also assumes that if there are any duplicates in RangeA column1, that you only want to return the first one for the corresponding RangeB entry. =================================== Option Explicit Sub RangeAtoB() Dim arB As Variant Dim i As Long Dim c As Range 'since RangeB is one column, must make it two arB = Range("RangeB").Resize(columnsize:=2) With Range("RangeA").Resize(columnsize:=1) For i = 1 To UBound(arB) Set c = .Find(what:=arB(i, 1), _ after:=.Item(1), _ LookIn:=xlValues, _ lookat:=xlWhole) If Not c Is Nothing Then arB(i, 2) = c.Offset(columnoffset:=1) Next i End With Range("rangeB").Resize(columnsize:=2) = arB End Sub ============================== |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 22, 6:34*am, Andrew wrote:
For k1 = 1 to 50000 dataX=worksheets(1).cells(k1,1) For k2=1 to 50000 dataY= worksheets(2).cells(k2,1) If dataX=dataY then worksheets(1).cells(k1,2)=worksheets(2).cells(k2,2 ) end if next next [....] This loop has to execute 50000^2 loops, 2.5 billion loops. [....] I don't know any tricks on how to speed this up. Well, that depends on details of the data and algorithm that you have not mentioned. (Note: I will use Sheet1 for Worksheets(1) and Sheet2 for Worksheets(2), although those might not be the actual worksheet names.) First, your algorithm continues to look for a match even after finding one. Effectively, it finds the __last__ match. Is that a requirement? Or is it sufficient to find the __first__ match? If you want to find the last match, your inner for-loop can be improved by doing For k2=50000 to 1 step -1. In either case, the inner loop can stop after finding a match. So you can put the following statement before End If: Exit For. Thus, on average a match is found after 25000 comparisons instead of 50000, assuming unique data in both Sheet1!A1:A50000 and Sheet2! A1:A50000. That cuts the total iterations in half, down to 1.25 billion. Second, in your example, Sheet1!A1:A50000 is in ascending order. If that is always the case, you could use a binary search instead of linear search. Thus, a match is found in at most 16 comparisions. That cuts the total iterations down to 800,000 at most, with some probability that it is much less. Finally, it is much more efficient to copy Sheet1!A1:B50000 and Sheet2! A1:A50000 into a Variant variable one time, and to retain the results in a Variant variable and copy them to Sheet2!B1:B50000 one time. The time savings can be huge. Every time you access an Excel range, VBA and Excel processes much exchange information -- two interprocess communications -- and the Excel process must be scheduled to run while the VBA process waits. Moreover, even if Sheet1!A1:A50000 is not in ascending order, it is probably advantageous to take the time to sort dataX below, if we implemented a binary search (not included below). So at a minimum (without the binary search), your VBA algorithm should be: Dim dataX, dataY dataX = Worksheets(1).Range("A1:B50000") dataY = Worksheets(2).Range("A1:A50000") redim resultY(1 to 50000, 1) for k1 = 1 to 50000: for k2 = 1 to 50000 if dataY(k1,1) = dataX(k2,1) then resultY(k1,2) = dataX(k2,2) exit for end if next: next Worksheets(2).Range("B1:B50000") = resultY (Note that I reversed the order of the for-loops. It would make a difference if you implement a binary search of Sheet1!A1:A50000.) Other improvements are possible and probably desirable; for example, avoiding the hardcoded ranges. That is why I use Redim instead of Dim for resultY. By reducing the interprocess communications (copying of data between VBA and Excel), I would be very surprised if that VBA implementation fails to outperform any of the alternatives mentioned so far: VBA Match/Index, .Find, and Excel COUNTIF/INDEX/MATCH. (Although the Excel alternative might be more efficient if IFERROR can be used instead of COUNTIF.) I might post relative performance data later, if I get around to it. |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
Errata....
On May 22, 4:14*pm, I wrote: redim resultY(1 to 50000, 1) Should be: redim resultY(1 to 50000, 1 to 1) I wrote: resultY(k1,2) = dataX(k2,2) Should be: resultY(k1,1) = dataX(k2,2) I wrote: Every time you access an Excel range, VBA and Excel processes much exchange information -- two interprocess communications Should be: "must exchange" |
#14
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 22, 4:14 pm, joeu2004 wrote:
Moreover, even if Sheet1!A1:A50000 is not in ascending order, it is probably advantageous to take the time to sort dataX below, if we implemented a binary search No! A selection sort requires nearly 1.250 billion (10^9) comparisions, overwhelming any benefit of the binary search. Excel sort can do that in a blink of an eye. But a VBA implementation takes "forever". Presumably VBA is significantly slower because it is interpreted, not compiled. I wrote: By reducing the interprocess communications (copying of data between VBA and Excel), I would be very surprised if that VBA implementation fails to outperform any of the alternatives mentioned so far: VBA Match/Index, .Find, and Excel COUNTIF/INDEX/MATCH. Color me "surprised"! When I made that comment, I forgot to account for the fact that VBA is interpreted, not compiled. Nonetheless, on my computer, a VBA binary search (50,000 times through 50,000 elements) is nearly 2.4 times faster than Isabelle's VBA Match algorithm (significantly improved and modified to use a binary search), and nearly 800 to 1200 times faster than Ron's VBA Find algorithm and Rick's Excel COUNTIF/MATCH formula (improved to use VLOOKUP and a binary search). However, Rick's Excel formula can be nearly 3.7 times faster than the VBA implementation if we use a helper column for the VLOOKUP (with a binary search) and IF(ISERROR) instead of COUNTIF. In XL2007 and later, I suspect we would get comparable results with IFERROR(VLOOKUP) without the need for a helper column. The improved VBA linear search is 1.8 to 2.2 times faster than Ron's VBA Find algorithm and Rick's Excel COUNTIF/MATCH formula (improved to use VLOOKUP, but a linear search). However, Isabelle's VBA Match algorithm (significantly improved, but a linear search) is nearly 5.8 times faster than the improved VBA linear search. Even Isabelle's obviously inefficient implementation (doing the Match twice) is nearly 2.9 times faster. And again, Rick's Excel formula can be nearly 6 times faster than the improved VBA linear search if we use a helper column for the VLOOKUP (with a linear search). YMMV significantly for a variety of reasons. Since my improved VBA linear search algorithm runs nearly 7 times longer than what Andrew claims ("45 seconds"), I suspect that my system is slow because of small memory, small cache or both relative to more modern systems. Also, my CPU is single-core and single-threaded (instruction "threads", not process threads). At first, I did not think that would be a factor, since VBA does not take advantage of multiple cores and instruction threads. However, Excel can, perhaps favoring Rick's approach. And I suspect it might improve Isabelle's and Ron's approaches for similar reasons; but that is based on speculation. |
#15
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
I really like reading your messages... the amount of analysis you are
willing to perform, and then write up for us to read, is simply awe-inspiring. Thank you for always taking the time to do this. Rick Rothstein (MVP - Excel) |
#16
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
"Andrew" wrote in message This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. You've been given many suggestions, particularly about taking advantage of using Excel's built in functions. However which ever approach, with this type of thing it's really worthwhile to start with both arrays sorted, typically that's a one off task, or not much extra work when adding small amounts of data. Once both sorted, search for the first matching item (and further duplicates). Then search the second item starting at index+1 of the last item found. Obviously in any loop after finding a matching item, as soon as the next item doesn't match (ie after any duplicates) exit that loop and start the next. You should end up with a fraction of the amount of working you are doing now. If regularly trying to find individual items in a large sorted array, search "binary chop", which will find matches in negligible time even in large arrays. Peter Thornton |
#17
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
Peter T pretended :
"Andrew" wrote in message This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. You've been given many suggestions, particularly about taking advantage of using Excel's built in functions. However which ever approach, with this type of thing it's really worthwhile to start with both arrays sorted, typically that's a one off task, or not much extra work when adding small amounts of data. Once both sorted, search for the first matching item (and further duplicates). Then search the second item starting at index+1 of the last item found. Obviously in any loop after finding a matching item, as soon as the next item doesn't match (ie after any duplicates) exit that loop and start the next. You should end up with a fraction of the amount of working you are doing now. If regularly trying to find individual items in a large sorted array, search "binary chop", which will find matches in negligible time even in large arrays. Peter Thornton Peter, I think the time it takes lies in the fact that *Arrays* aren't being compared, but rather the loops read the ranges directly. If the 2 ranges were dumped into arrays where at least 1 array was sorted, a binary search would cut the processing time considerably. Better, though, (as you say) that both arrays are sorted. The bottleneck is trying to loop the ranges directly while doing this in memory would run orders of magnitude faster.<IMO -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#18
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
On May 23, 5:42*pm, GS wrote:
Peter T pretended : "Andrew" wrote in message This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. You've been given many suggestions, particularly about taking advantage of using Excel's built in functions. However which ever approach, with this type of thing it's really worthwhile to start with both arrays sorted, typically that's a one off task, or not much extra work when adding small amounts of data. Once both sorted, search for the first matching item (and further duplicates). Then search the second item starting at index+1 of the last item found. Obviously in any loop after finding a matching item, as soon as the next item doesn't match (ie after any duplicates) exit that loop and start the next. You should end up with a fraction of the amount of working you are doing now. If regularly trying to find individual items in a large sorted array, search "binary chop", which will find matches in negligible time even in large arrays. Peter Thornton Peter, I think the time it takes lies in the fact that *Arrays* aren't being compared, but rather the loops read the ranges directly. If the 2 ranges were dumped into arrays where at least 1 array was sorted, a binary search would cut the processing time considerably. Better, though, (as you say) that both arrays are sorted. The bottleneck is trying to loop the ranges directly while doing this in memory would run orders of magnitude faster.<IMO -- Garry Free usenet access athttp://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc There's some very good information in this thread. Thank you. I have yet to try these find and match methods. Although I did use VLOOKUP for a search and reduced the search time (for the same 50,000 rows of data) down to 3 seconds. In using VLOOKUP I used named ranges for the first two input arguments, the search parameter and the lookup table. Then I tried to do the same thing using arrays instead of ranges. It took at least one hundred times longer. I would think that it would be faster to sort through an array than a range, but clearly this is not true. thanks again, Andrew |
#19
Posted to microsoft.public.excel.programming
|
|||
|
|||
how to find union of two arrays
"GS" wrote in message Peter T pretended : "Andrew" wrote in message This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say that each iteration requires 20 CPU clock cycles, that's 50 billion clock cycles. You've been given many suggestions, particularly about taking advantage of using Excel's built in functions. However which ever approach, with this type of thing it's really worthwhile to start with both arrays sorted, typically that's a one off task, or not much extra work when adding small amounts of data. Once both sorted, search for the first matching item (and further duplicates). Then search the second item starting at index+1 of the last item found. Obviously in any loop after finding a matching item, as soon as the next item doesn't match (ie after any duplicates) exit that loop and start the next. You should end up with a fraction of the amount of working you are doing now. If regularly trying to find individual items in a large sorted array, search "binary chop", which will find matches in negligible time even in large arrays. Peter Thornton Peter, I think the time it takes lies in the fact that *Arrays* aren't being compared, but rather the loops read the ranges directly. If the 2 ranges were dumped into arrays where at least 1 array was sorted, a binary search would cut the processing time considerably. Better, though, (as you say) that both arrays are sorted. The bottleneck is trying to loop the ranges directly while doing this in memory would run orders of magnitude faster.<IMO Garry, Indeed, reading cells is considerably longer than reading array values, though there will be some overhead initially reading 2x50k cells to a array or pair of arrays. It really is important to sort the first array, whether using built-in Excel functions or VBA. Also if using VBA extremely worthwhile to sort the 2nd array. That's because for each cell in the 2nd array, code will only need to look from the index+1 of the last match to the first item that doesn't match, so much less work than available to Excel's built-in functions. Despite Excel's normally faster methods, it's difficult to take advantage of that approach in Excel (although could be done) so VBA might be easier/faster. As for sorting, as you say in your adjacent post "that depends on how you do it". There are some extremely fast sort methods out there for VBA, some compare well if not better than Excel's built-in method (excluding the time to transfer from cells to array to cells), particularly if the data is all say doubles. However to keep things simple, maybe get VBA to leverage Excel's sort, then compare the sorted data in VBA as above. Without testing not sure which'd be fastest, VBA+Excel or just Excel. Peter T |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Find Min and Max in variable length arrays | Excel Programming | |||
Compare two arrays in VBA to find dupes | Excel Programming | |||
Arrays - declaration, adding values to arrays and calculation | Excel Programming | |||
UNION of Arrays - is possible? | Excel Discussion (Misc queries) | |||
Where can I find a template for a Credit Union account? | Excel Discussion (Misc queries) |