Count Unique Text Values
Adding the ~ bit to the COUNT formula seems to make both formula
equally fast, as measured with FastExcel.
Very interesting. I didn't think appending "" vs "~" would make much
difference, but apparently it does.
BTW, I did not test the
effects of * vs IF and MIN.
So I don't think we can draw any concrete conclusions, since we're not
comparing apples to apples.
In article . com,
"Aladin Akyurek" wrote:
On Mar 27, 8:28 pm, "Harlan Grove" wrote:
Domenic wrote...
Thanks Harlan! As usual, very interesting. However, I noticed
that the COUNT formula is very inefficient. After an informal
test where I increased the lower boundary for the range from Row 6
to Row 20000, here's what I found...
COUNT formula ----- approximately 113 seconds to calculate
FREQUENCY formula ----- approximately 3 seconds to calculate
Although, I should mention that I'm using a Mac version of Excel.
I don't know if this makes a difference.
...
It does seem to make a difference. 20 iterations of your formula under
Windows XP and Excel 2003 on my laptop takes about 5 seconds, while 20
iterations of my formula takes about 16 seconds. 3:1 rather than 30:1
difference.
The constraint isn't the COUNT function, it's the array expression in
MATCH's 2nd argument in my formula. Looks like FREQUENCY is best, but
could still wrap COUNT around it, so
=COUNT(1/FREQUENCY(IF(($A$2:$A$20000=E2)*($C$2:$C$20000=F2) ,
MATCH(""&$B$2:$B$20000,""&$B$2:$B$20000,0)),
ROW($B$2:$B$20000)-MIN(ROW($B$2:$B$20000))+1))
rather than
=SUM(IF(FREQUENCY(IF($A$2:$A$20000=E2,IF($C$2:$C$2 0000=F2,
IF($B$2:$B$20000<"",MATCH("~"&$B$2:$B$20000,$B$2: $B$20000&"",
0)))),ROW($B$2:$B$20000)-ROW($B$2)+1),1))
For 100 iterations, yours takes 25 seconds while my COUNT-FREQUENCY
formula takes 21 seconds.
Adding the ~ bit to the COUNT formula seems to make both formula
equally fast, as measured with FastExcel. BTW, I did not test the
effects of * vs IF and MIN.
|