View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.programming
Rick Rothstein Rick Rothstein is offline
external usenet poster
 
Posts: 5,934
Default how to find union of two arrays

Imagine one range (A) with 50,000 rows, 2 columns, a second range with
50,000 rows (B) 1 column. I want to find all the elements in B which
have matches in column 1 of A. For each match, I want to copy the
corresponding entry in A column 2 over to a new column 2 in B.

Example.
A=
1 a
2 f
3 u
4 w
5 q

B=
5
1
6
8
9


After running the code, B would be transformed to:
B=
5 q
1 a
8
9
6

To do this I create the following loops.

For k1 = 1 To 50000
dataX = Worksheets(1).Cells(k1, 1)
For k2 = 1 To 50000
dataY = Worksheets(2).Cells(k2, 1)

If dataX = dataY Then
Worksheets(1).Cells(k1, 2) = Worksheets(2).Cells(k2, 2)
End If

Next
Next

This code is accurate, but terrribly slow.

This loop has to execute 50000^2 loops, 2.5 billion loops. Let's say
that each iteration requires 20 CPU clock cycles, that's 50 billion
clock cycles. My CPU has a 2.5 GHz processor, so this ideally would
take 20 seconds. It takes more than that, about 45 seconds. If I
were doing this in Matlab or C, I could process this much data in less
than a second, but not because the CPU is faster. There are other
programming constructs available which are faster. I know Excel can
run faster than this, but I don't know any tricks on how to speed this
up.


Why use code at all? You can do this with straight Excel formulas. Assuming
your range "A" is located on Sheet1, Columns A and B, starting in Row 1; and
assuming your range "B" is located in Column A starting in Row 1 on any
other sheet, put this formula in B1 on that other sheet and copy it down...

=IF(COUNTIF(Sheet1!A$1:A$500,A1),MATCH(A1,Sheet1!A $1:A$500,0),"")

Rick Rothstein (MVP - Excel)