![]() |
Count Unique Text Values
I am trying to write a formula that will count the unique values in an
array. I have been able to get this far... Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 What I need is the unique occurances of the Branch for each different Item. This is what I have written so far: =COUNT(1/FREQUENCY(IF((B1:B5=C1),MATCH(A1:A5,A1:A5,0)),ROW( 1:5))) I am just not sure if you can next another If Statement and/or how you would do that with this type of Array. Thanks for the help. |
Count Unique Text Values
-- KL [MVP - Microsoft Excel] RU: http://www.mvps.ru/Program/Default.aspx ES: http://mvp.support.microsoft.com/?LN=es-es EN: http://mvp.support.microsoft.com/?LN=en-us Profile: https://mvp.support.microsoft.com/pr...A-9E6C73C09A36 "Scott Halper" wrote in message oups.com... I am trying to write a formula that will count the unique values in an array. I have been able to get this far... Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 What I need is the unique occurances of the Branch for each different Item. This is what I have written so far: =COUNT(1/FREQUENCY(IF((B1:B5=C1),MATCH(A1:A5,A1:A5,0)),ROW( 1:5))) I am just not sure if you can next another If Statement and/or how you would do that with this type of Array. Thanks for the help. |
Count Unique Text Values
try something like this:
if there are no empty cells in column B, then =SUMPRODUCT(--(FREQUENCY(MATCH(B1:B5,B1:B5,0),MATCH(B1:B5,B1:B5, 0)*(A1:A5="Jan")*(C1:C5=2))0)) if there are empty cells in column B and they need to be counted, then =SUMPRODUCT(--(FREQUENCY(MATCH(B1:B5&"",B1:B5&"",0),MATCH(B1:B5& "",B1:B5&"",0)*(A1:A5="Jan")*(C1:C5=2))0)) if there are empty cells in column B and they do not need to be counted, then =SUMPRODUCT(--(FREQUENCY(MATCH(B1:B5&"",B1:B5&"",0),MATCH(B1:B5& "",B1:B5&"",0)*(A1:A5="Jan")*(B1:B5<"")*(C1:C5=2) )0)) Attention, the formula may slow down you worksheet's recalc as it is calculation-intensive -- KL [MVP - Microsoft Excel] RU: http://www.mvps.ru/Program/Default.aspx ES: http://mvp.support.microsoft.com/?LN=es-es EN: http://mvp.support.microsoft.com/?LN=en-us Profile: https://mvp.support.microsoft.com/pr...A-9E6C73C09A36 "Scott Halper" wrote in message oups.com... I am trying to write a formula that will count the unique values in an array. I have been able to get this far... Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 What I need is the unique occurances of the Branch for each different Item. This is what I have written so far: =COUNT(1/FREQUENCY(IF((B1:B5=C1),MATCH(A1:A5,A1:A5,0)),ROW( 1:5))) I am just not sure if you can next another If Statement and/or how you would do that with this type of Array. Thanks for the help. |
Count Unique Text Values
Assuming that A2:C6 contains the data, let E2 and E3 contain 1 and 2,
then try the following formula that needs to be confirmed with CONTROL+SHIFT+ENTER... F2, copied down: =COUNT(1/FREQUENCY(IF($C$2:$C$6=E2,IF($B$2:$B$6<"",MATCH(" ~"&$B$2:$B$6,$ B$2:$B$6&"",0))),ROW($B$2:$B$6)-ROW($B$2)+1)) Hope this helps! In article .com, "Scott Halper" wrote: I am trying to write a formula that will count the unique values in an array. I have been able to get this far... Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 What I need is the unique occurances of the Branch for each different Item. This is what I have written so far: =COUNT(1/FREQUENCY(IF((B1:B5=C1),MATCH(A1:A5,A1:A5,0)),ROW( 1:5))) I am just not sure if you can next another If Statement and/or how you would do that with this type of Array. Thanks for the help. |
Count Unique Text Values
Hello,
I suggest to apply my UDF lfreq to your combined cells Item & Branch: Introduce a helper column D (Item_Branch): Enter into D2 =C1&" "&B1 and copy down. Then select a sufficiently big area anywhere and array-enter =lfreq(D2:D999) The function lfreq you can get from http://www.sulprobil.com/html/listfreq.html. Regards, Bernd |
Count Unique Text Values
Everyone,
I appreciate all your responses, however let me try to explain it with some more descriptive information. Month Item Branch Jan A 1 Jan B 2 Feb A 1 Feb A 2 Feb C 1 What I'm looking for is a formula that will tell me in the month of February Item A was sold in 2 unique branches (with branches have 1,000 of choices). Thanks for you help again. Scott |
Count Unique Text Values
Let E2 contain Jan, and F2 contain A, then try the following formula
that needs to be confirmed with CONTROL+SHIFT+ENTER... =SUM(IF(FREQUENCY(IF($A$2:$A$6=E2,IF($B$2:$B$6=F2, IF($C$2:$C$6<"",MATCH( "~"&$C$2:$C$6,$C$2:$C$6&"",0)))),ROW($C$2:$C$6 )-ROW($C$2)+1),1)) Hope this helps! In article om, "Scott Halper" wrote: Everyone, I appreciate all your responses, however let me try to explain it with some more descriptive information. Month Item Branch Jan A 1 Jan B 2 Feb A 1 Feb A 2 Feb C 1 What I'm looking for is a formula that will tell me in the month of February Item A was sold in 2 unique branches (with branches have 1,000 of choices). Thanks for you help again. Scott |
Count Unique Text Values
"KL" wrote...
try something like this: if there are no empty cells in column B, then =SUMPRODUCT(--(FREQUENCY(MATCH(B1:B5,B1:B5,0), MATCH(B1:B5,B1:B5,0)*(A1:A5="Jan")*(C1:C5=2))0 )) .... Even using the OP's data, Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 your wonderful formula returns 2 even though only the second row matches the (A1:A5="Jan")*(C1:C5=2) criteria. Can you figure out why? Did you bother to test? Rhetorical - obviously not. |
Count Unique Text Values
Domenic wrote...
Let E2 contain Jan, and F2 contain A, then try the following formula that needs to be confirmed with CONTROL+SHIFT+ENTER... =SUM(IF(FREQUENCY(IF($A$2:$A$6=E2,IF($B$2:$B$6=F2 , IF($C$2:$C$6<"",MATCH("~"&$C$2:$C$6,$C$2:$C$6&"" ,0)))), ROW($C$2:$C$6)-ROW($C$2)+1),1)) .... There are shorter alternatives. Also an array formula, =COUNT(1/(MATCH(""&$B$2:$B$6,IF(($A$2:$A$6=E2)*($C$2:$C$6=F 2), ""&$B$2:$B$6,0),0)=ROW($B$2:$B$6)-MIN(ROW($B$2:$B$6))+1)) |
Count Unique Text Values
Thanks everyone for all your help!
|
Count Unique Text Values
"Harlan Grove" wrote
...Even using the OP's data,... your wonderful formula returns 2 even though only the second row matches the (A1:A5="Jan")*(C1:C5=2) criteria. Yup, you're right. Thanks for jumping in. Can you figure out why? Of course I can :-) It is the extra element (the count of values superior of the max interval). Did you bother to test? Rhetorical - obviously not. I did, but with my own randomly generated data based on the OP's pattern. It so happened that in my sample the max value was preceded by a 0 in the second argument something like this: FREQUENCY({1;2;3;3;1},{0;2;0;3;0}) - {0;3;0;2;0;0} ....so when it worked after a couple of random changes I wrongly concluded that the formula was good. Did you bother to consider this possibility before firing out your "verdict"? Rhetorical - obviously not, but thanks anyway ;-) -- KL [MVP - Microsoft Excel] RU: http://www.mvps.ru/Program/Default.aspx ES: http://mvp.support.microsoft.com/?LN=es-es EN: http://mvp.support.microsoft.com/?LN=en-us Profile: https://mvp.support.microsoft.com/pr...A-9E6C73C09A36 |
Count Unique Text Values
Thanks Harlan! As usual, very interesting. However, I noticed that the
COUNT formula is very inefficient. After an informal test where I increased the lower boundary for the range from Row 6 to Row 20000, here's what I found... COUNT formula ----- approximately 113 seconds to calculate FREQUENCY formula ----- approximately 3 seconds to calculate Although, I should mention that I'm using a Mac version of Excel. I don't know if this makes a difference. In article .com, "Harlan Grove" wrote: Domenic wrote... Let E2 contain Jan, and F2 contain A, then try the following formula that needs to be confirmed with CONTROL+SHIFT+ENTER... =SUM(IF(FREQUENCY(IF($A$2:$A$6=E2,IF($B$2:$B$6=F2 , IF($C$2:$C$6<"",MATCH("~"&$C$2:$C$6,$C$2:$C$6&"" ,0)))), ROW($C$2:$C$6)-ROW($C$2)+1),1)) ... There are shorter alternatives. Also an array formula, =COUNT(1/(MATCH(""&$B$2:$B$6,IF(($A$2:$A$6=E2)*($C$2:$C$6=F 2), ""&$B$2:$B$6,0),0)=ROW($B$2:$B$6)-MIN(ROW($B$2:$B$6))+1)) |
Count Unique Text Values
Domenic wrote...
Thanks Harlan! As usual, very interesting. However, I noticed that the COUNT formula is very inefficient. After an informal test where I increased the lower boundary for the range from Row 6 to Row 20000, here's what I found... COUNT formula ----- approximately 113 seconds to calculate FREQUENCY formula ----- approximately 3 seconds to calculate Although, I should mention that I'm using a Mac version of Excel. I don't know if this makes a difference. .... It does seem to make a difference. 20 iterations of your formula under Windows XP and Excel 2003 on my laptop takes about 5 seconds, while 20 iterations of my formula takes about 16 seconds. 3:1 rather than 30:1 difference. The constraint isn't the COUNT function, it's the array expression in MATCH's 2nd argument in my formula. Looks like FREQUENCY is best, but could still wrap COUNT around it, so =COUNT(1/FREQUENCY(IF(($A$2:$A$20000=E2)*($C$2:$C$20000=F2) , MATCH(""&$B$2:$B$20000,""&$B$2:$B$20000,0)), ROW($B$2:$B$20000)-MIN(ROW($B$2:$B$20000))+1)) rather than =SUM(IF(FREQUENCY(IF($A$2:$A$20000=E2,IF($C$2:$C$2 0000=F2, IF($B$2:$B$20000<"",MATCH("~"&$B$2:$B$20000,$B$2: $B$20000&"", 0)))),ROW($B$2:$B$20000)-ROW($B$2)+1),1)) For 100 iterations, yours takes 25 seconds while my COUNT-FREQUENCY formula takes 21 seconds. |
Count Unique Text Values
"KL" wrote...
.... Did you bother to consider this possibility before firing out your "verdict"? . . . Your formula failing with the OP's own sample data could be construed as untested. I did take into account light testing with randomly generated values leading to fortuitously correct results. Happens all the time. |
Count Unique Text Values
In article .com,
"Harlan Grove" wrote: It does seem to make a difference. 20 iterations of your formula under Windows XP and Excel 2003 on my laptop takes about 5 seconds, while 20 iterations of my formula takes about 16 seconds. 3:1 rather than 30:1 difference. Wow! There's a big difference between Windows and Mac versions of Excel. Very disappointing... The constraint isn't the COUNT function, it's the array expression in MATCH's 2nd argument in my formula. Looks like FREQUENCY is best, but could still wrap COUNT around it, so =COUNT(1/FREQUENCY(IF(($A$2:$A$20000=E2)*($C$2:$C$20000=F2) , MATCH(""&$B$2:$B$20000,""&$B$2:$B$20000,0)), ROW($B$2:$B$20000)-MIN(ROW($B$2:$B$20000))+1)) rather than =SUM(IF(FREQUENCY(IF($A$2:$A$20000=E2,IF($C$2:$C$2 0000=F2, IF($B$2:$B$20000<"",MATCH("~"&$B$2:$B$20000,$B$2: $B$20000&"", 0)))),ROW($B$2:$B$20000)-ROW($B$2)+1),1)) For 100 iterations, yours takes 25 seconds while my COUNT-FREQUENCY formula takes 21 seconds. Here I'm surprised. I thought 1/Array would be more expensive. Thanks Harlan! Much appreciated! |
Count Unique Text Values
Hi Scott,
Assuming that your data is in the range A1:C6... If you concatenate Month and Branch in Column D, by copying the following formula from this post into D2: =A2&","&B2 copy the formula from D2, and paste into D3:D6 Enter copy the following array formula from this post into E1: =IF(ROWS($1:1)<=SUM(1/COUNTIF(D$2:D$6,D$2:D$6)),INDEX(D$2:D$6,SMALL(IF(R OW(D$2:D$6)-MIN(ROW(D$2:D$6))+1=MATCH(D$2:D$6,D$2:D$6,0),ROW(D $2:D$6)-MIN(ROW(D$2:D$6))+1),ROWS($1:1))),"") When you enter the formula into the cell, press Ctrl and Shift while pressing Enter, and Excel will place curly brackets {} around the formula. Copy the formula from E1 and paste into E2 through E4. Note that this formula cannot have any blank cells in the D2:D6 range or it will produce a division error. Then copy the formula below from this post and paste into F1: =COUNT(1/FREQUENCY(IF($D$2:$D$6=E1,IF($C$2:$C$6<"",MATCH(" ~"&$C$2:$C$6,$C$2:$C$6&"",0))),ROW($C$2:$C$6)-ROW($C$2)+1)) When you enter the formula into the cell, press Ctrl and Shift while pressing Enter, and Excel will place curly brackets {} around the formula. Copy the formula from F1 and paste into F2 through F4. "Scott Halper" wrote: Everyone, I appreciate all your responses, however let me try to explain it with some more descriptive information. Month Item Branch Jan A 1 Jan B 2 Feb A 1 Feb A 2 Feb C 1 What I'm looking for is a formula that will tell me in the month of February Item A was sold in 2 unique branches (with branches have 1,000 of choices). Thanks for you help again. Scott |
Count Unique Text Values
Hi Scott,
Domenic's post worked for me. Here is a way to get a list of unique Item values, assuming the range for Item is C2:C6. Copy the formula down a column of cells until you get a blank. =IF(ROWS($1:1)<=SUM(1/COUNTIF(C$2:C$6,C$2:C$6)),INDEX(C$2:C$6,SMALL(IF(R OW(C$2:C$6)-MIN(ROW(C$2:C$6))+1=MATCH(C$2:C$6,C$2:C$6,0),ROW(C $2:C$6)-MIN(ROW(C$2:C$6))+1),ROWS($1:1))),"") "Scott Halper" wrote: I am trying to write a formula that will count the unique values in an array. I have been able to get this far... Month Branch Item Jan A 1 Jan B 2 Feb C 1 Feb C 2 Feb A 1 What I need is the unique occurances of the Branch for each different Item. This is what I have written so far: =COUNT(1/FREQUENCY(IF((B1:B5=C1),MATCH(A1:A5,A1:A5,0)),ROW( 1:5))) I am just not sure if you can next another If Statement and/or how you would do that with this type of Array. Thanks for the help. |
Count Unique Text Values
On Mar 27, 8:28 pm, "Harlan Grove" wrote:
Domenic wrote... Thanks Harlan! As usual, very interesting. However, I noticed that the COUNT formula is very inefficient. After an informal test where I increased the lower boundary for the range from Row 6 to Row 20000, here's what I found... COUNT formula ----- approximately 113 seconds to calculate FREQUENCY formula ----- approximately 3 seconds to calculate Although, I should mention that I'm using a Mac version of Excel. I don't know if this makes a difference. ... It does seem to make a difference. 20 iterations of your formula under Windows XP and Excel 2003 on my laptop takes about 5 seconds, while 20 iterations of my formula takes about 16 seconds. 3:1 rather than 30:1 difference. The constraint isn't the COUNT function, it's the array expression in MATCH's 2nd argument in my formula. Looks like FREQUENCY is best, but could still wrap COUNT around it, so =COUNT(1/FREQUENCY(IF(($A$2:$A$20000=E2)*($C$2:$C$20000=F2) , MATCH(""&$B$2:$B$20000,""&$B$2:$B$20000,0)), ROW($B$2:$B$20000)-MIN(ROW($B$2:$B$20000))+1)) rather than =SUM(IF(FREQUENCY(IF($A$2:$A$20000=E2,IF($C$2:$C$2 0000=F2, IF($B$2:$B$20000<"",MATCH("~"&$B$2:$B$20000,$B$2: $B$20000&"", 0)))),ROW($B$2:$B$20000)-ROW($B$2)+1),1)) For 100 iterations, yours takes 25 seconds while my COUNT-FREQUENCY formula takes 21 seconds. Adding the ~ bit to the COUNT formula seems to make both formula equally fast, as measured with FastExcel. BTW, I did not test the effects of * vs IF and MIN. |
Count Unique Text Values
Adding the ~ bit to the COUNT formula seems to make both formula
equally fast, as measured with FastExcel. Very interesting. I didn't think appending "" vs "~" would make much difference, but apparently it does. BTW, I did not test the effects of * vs IF and MIN. So I don't think we can draw any concrete conclusions, since we're not comparing apples to apples. In article . com, "Aladin Akyurek" wrote: On Mar 27, 8:28 pm, "Harlan Grove" wrote: Domenic wrote... Thanks Harlan! As usual, very interesting. However, I noticed that the COUNT formula is very inefficient. After an informal test where I increased the lower boundary for the range from Row 6 to Row 20000, here's what I found... COUNT formula ----- approximately 113 seconds to calculate FREQUENCY formula ----- approximately 3 seconds to calculate Although, I should mention that I'm using a Mac version of Excel. I don't know if this makes a difference. ... It does seem to make a difference. 20 iterations of your formula under Windows XP and Excel 2003 on my laptop takes about 5 seconds, while 20 iterations of my formula takes about 16 seconds. 3:1 rather than 30:1 difference. The constraint isn't the COUNT function, it's the array expression in MATCH's 2nd argument in my formula. Looks like FREQUENCY is best, but could still wrap COUNT around it, so =COUNT(1/FREQUENCY(IF(($A$2:$A$20000=E2)*($C$2:$C$20000=F2) , MATCH(""&$B$2:$B$20000,""&$B$2:$B$20000,0)), ROW($B$2:$B$20000)-MIN(ROW($B$2:$B$20000))+1)) rather than =SUM(IF(FREQUENCY(IF($A$2:$A$20000=E2,IF($C$2:$C$2 0000=F2, IF($B$2:$B$20000<"",MATCH("~"&$B$2:$B$20000,$B$2: $B$20000&"", 0)))),ROW($B$2:$B$20000)-ROW($B$2)+1),1)) For 100 iterations, yours takes 25 seconds while my COUNT-FREQUENCY formula takes 21 seconds. Adding the ~ bit to the COUNT formula seems to make both formula equally fast, as measured with FastExcel. BTW, I did not test the effects of * vs IF and MIN. |
All times are GMT +1. The time now is 07:19 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com