View Single Post
  #19   Report Post  
Posted to microsoft.public.excel.misc,microsoft.public.excel.programming,microsoft.public.excel
Biff Biff is offline
external usenet poster
 
Posts: 1,688
Default searching a large database with a long list of search terms

Here are the results of 3 tests:

Computer specs:

Pentium P4, 2.0 ghz, 256 Mb ram, WinXP (all service packs, all patches),
Excel 2002 (XP) (all service packs)
Other than the operating system, Excel is the only app running.

File configuration: (based on the OPs description)

7 sheets total, 1 summary, 6 data

Summary sheet(1): 3 columns x 3000 rows. Lookup values in column A, A1:A3000
Data sheets(6): 2 columns x 65536 rows

Test 1 (based on the reply from Max)

File size (wo/formulas) - 26.6 Mb
File size (w/formulas) - 27.5 Mb

This formula was copied to 6 columns x 3000 rows:

=IF($A1="","",IF(ISNA(MATCH($A1,INDIRECT("'"&COLUM N(A1)&"'!A:A"),0)),"NO
ENTRY",VLOOKUP($A1,INDIRECT("'"&COLUMN(A1)&"'!A:B" ),2,0)))

I was unable to copy/drag in a single operation. When I tried, Excel
"froze-up". I had to use Task Manager to regain control. Tried twice and
Excel "froze" both times. I had to drag copy in increments of ~200 rows at a
time. I didn't time this but to copy to all 3000 rows took at least an hour.
(calculation was on automatic) After all formulas were copied:

Calc time (data sheets unsorted) ~6:45 (m:ss)
Calc time (data sheets sorted ascending) ~3:30 (m:ss)

Test 2

Deleted all the above formulas, reset the used range.

Used this array formula copied to 1 column x 3000 rows:

=IF(SUMPRODUCT(COUNTIF(INDIRECT("'"&WSlist&"'!A:A" ),A1)),VLOOKUP(A1,INDIRECT("'"&INDEX(WSlist,MATCH( TRUE,COUNTIF(INDIRECT("'"&WSlist&"'!A:A"),A1)0,0) )&"'!A:B"),2,0),"")

After all formulas were copied:

File size (w/formulas) - 27.2 Mb
Calc time (data sheets sorted ascending) ~3:35 (m:ss)

Test 3

Deleted all the above formulas. Decided to try a monster nested IF formula
but I hit the nested function limit so I split the formula into 2 cells. I
cell formula did the lookup on sheets 2,3,4. The other cell formula did the
lookup on sheets 5,6,7.

=IF(NOT(ISERROR(VLOOKUP(A1,Sheet2!A:B,2,0))),VLOOK UP(A1,Sheet2!A:B,2,0),IF(NOT(ISERROR(VLOOKUP(A1,Sh eet3!A:B,2,0))),VLOOKUP(A1,Sheet3!A:B,2,0),IF(NOT( ISERROR(VLOOKUP(A1,Sheet4!A:B,2,0))),VLOOKUP(A1,Sh eet4!A:B,2,0),"")))

=IF(D1="",IF(NOT(ISERROR(VLOOKUP(A1,Sheet5!A:B,2,0 ))),VLOOKUP(A1,Sheet5!A:B,2,0),IF(NOT(ISERROR(VLOO KUP(A1,Sheet6!A:B,2,0))),VLOOKUP(A1,Sheet6!A:B,2,0 ),IF(NOT(ISERROR(VLOOKUP(A1,Sheet7!A:B,2,0))),VLOO KUP(A1,Sheet7!A:B,2,0),"No
Entry"))),"")

After all formulas were copied:

File size (w/formulas) - 28.2 Mb
Calc time (data sheets sorted ascending) ~1 second

I did not test using unsorted data sheets in tests 2 and 3.

Conclusion:

Sorting the data can speed up calc time significantly in "large" files. The
use of 1000's of volatile functions should be avoided at all costs! Monster
formulas aren't all bad!

Comments/suggestions welcome!

Biff

"Biff" wrote in message
...
If anyone is still following this thread I'll do some tests and post the
results. Stay tuned!

Biff

"Pete_UK" wrote in message
ups.com...
Max,

The OP didn't get back to me when I asked about sorting the reference
data beforehand. Can you sort your random data in the 6 sheets then
re-apply your formula to take advantage of this to see if there is a
big increase in speed? In theory, the binary search technique applied
if the data is sorted should make a massive difference to 6 * 65536
entries.

Pete