View Single Post
  #19   Report Post  
Posted to microsoft.public.excel.programming
GS[_2_] GS[_2_] is offline
external usenet poster
 
Posts: 3,514
Default how to find union of two arrays

Peter T pretended :
"Andrew" wrote in message

This loop has to execute 50000^2 loops, 2.5 billion loops.
Let's say that each iteration requires 20 CPU clock
cycles, that's 50 billion clock cycles.


You've been given many suggestions, particularly about taking advantage of
using Excel's built in functions. However which ever approach, with this type
of thing it's really worthwhile to start with both arrays sorted, typically
that's a one off task, or not much extra work when adding small amounts of
data.

Once both sorted, search for the first matching item (and further
duplicates). Then search the second item starting at index+1 of the last item
found. Obviously in any loop after finding a matching item, as soon as the
next item doesn't match (ie after any duplicates) exit that loop and start
the next. You should end up with a fraction of the amount of working you are
doing now.

If regularly trying to find individual items in a large sorted array, search
"binary chop", which will find matches in negligible time even in large
arrays.

Peter Thornton


Peter,
I think the time it takes lies in the fact that *Arrays* aren't being
compared, but rather the loops read the ranges directly.

If the 2 ranges were dumped into arrays where at least 1 array was
sorted, a binary search would cut the processing time considerably.
Better, though, (as you say) that both arrays are sorted. The
bottleneck is trying to loop the ranges directly while doing this in
memory would run orders of magnitude faster.<IMO

--
Garry

Free usenet access at http://www.eternal-september.org
ClassicVB Users Regroup! comp.lang.basic.visual.misc