![]() |
Explanation of SUMPRODUCT
Could someone please explain how the following works:
=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1)) I understand this calculates the number of unique values in a list so if A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For example, I don't understand how COUNTIF works in this situation i.e where the comparator is an array. In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha string e.g. abcde to a numbers (12345) and summed them. Any explanation of how this works would also be appreciated. I have looked at the XLD site article on SUMPRODUCT and understand the basics but these more advanced uses are baffling me! Many thanks in advance. John |
Explanation of SUMPRODUCT
John,
Here goes with my attempt. Let's start by defining the range A1:A10 to talk specifics. Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max In the example that you show, which should be (at least) =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an array value is being used to count the number of each of times the value in each cell is repeated. So in this case, A1 holds Bob which is repeated 4 times, so that part of the COUNTIF returns 4. A2 holds John, so COUNTIF returns 3 for A2. But A3 also holds Bob, which also returns 4. And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work through). So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}. The array results of the COUNTIF are then divided into 1 to get a fractional value of each element of the array. This is the part that effectively does the counting, as the 4 instances of Bob each return an array element of 4, which when divided into 1, each give 0.25, which when added together gives 1. Voila. So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is {0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0 .5;1} SUMPRODUCT then adds these up to come up with the number of unique entries, 4 in this case, because each separate value in the test range sums to 1. I don't recall the case that you mention, but I do recall a formula where I used SUMPRODUCT to add the values at the end of a mixed text/number field, like Mike 10. The formula in question was =SUMPRODUCT(--(MID(A1:A10,FIND*("~",SUBSTITUTE(A1:A10,"%%%","~", LEN(A1:A10)- LEN(SUBSTITU*TE(A1:A10," ",""))))+1,99))) Not that the %%% should be a single space, but I put %%% in as the NG wraps at that point and can lose it. What this is doing is simply using a tried and trusted formula, MID(A1,FIND*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITU*TE(A1," ",""))))+1,99), but this time on a range not a single cell, to get the last part of a delimited string, delimited by space in this case. So in Mike 10, it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just to sum the results. BTW, to get a better understanding of what goes on in these things, use the F9 to evaluate the formula. In the formula bar, select the part of the formula you wish to evaluate, press F9, and you see the results. Hit Esc to exit. -- HTH RP (remove nothere from the email address if mailing direct) "Toppers" wrote in message ... Could someone please explain how the following works: =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1)) I understand this calculates the number of unique values in a list so if A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For example, I don't understand how COUNTIF works in this situation i.e where the comparator is an array. In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha string e.g. abcde to a numbers (12345) and summed them. Any explanation of how this works would also be appreciated. I have looked at the XLD site article on SUMPRODUCT and understand the basics but these more advanced uses are baffling me! Many thanks in advance. John |
Explanation of SUMPRODUCT
Bob,
Many thannks for the prompt reply and excellent explanation (plus F9 tip). All is much clearer now! I never cease to be amazed by the ingenuity shown by the experts like yourself on using the features/functions in Excel ... but it's great to have you all available online. Again many thanks - your help is really appreciated. John "Bob Phillips" wrote: John, Here goes with my attempt. Let's start by defining the range A1:A10 to talk specifics. Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max In the example that you show, which should be (at least) =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an array value is being used to count the number of each of times the value in each cell is repeated. So in this case, A1 holds Bob which is repeated 4 times, so that part of the COUNTIF returns 4. A2 holds John, so COUNTIF returns 3 for A2. But A3 also holds Bob, which also returns 4. And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work through). So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}. The array results of the COUNTIF are then divided into 1 to get a fractional value of each element of the array. This is the part that effectively does the counting, as the 4 instances of Bob each return an array element of 4, which when divided into 1, each give 0.25, which when added together gives 1. Voila. So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is {0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0 .5;1} SUMPRODUCT then adds these up to come up with the number of unique entries, 4 in this case, because each separate value in the test range sums to 1. I don't recall the case that you mention, but I do recall a formula where I used SUMPRODUCT to add the values at the end of a mixed text/number field, like Mike 10. The formula in question was =SUMPRODUCT(--(MID(A1:A10,FINDĀ*("~",SUBSTITUTE(A1:A10,"%%%","~" ,LEN(A1:A10)- LEN(SUBSTITUĀ*TE(A1:A10," ",""))))+1,99))) Not that the %%% should be a single space, but I put %%% in as the NG wraps at that point and can lose it. What this is doing is simply using a tried and trusted formula, MID(A1,FINDĀ*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITUĀ*TE(A1," ",""))))+1,99), but this time on a range not a single cell, to get the last part of a delimited string, delimited by space in this case. So in Mike 10, it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just to sum the results. BTW, to get a better understanding of what goes on in these things, use the F9 to evaluate the formula. In the formula bar, select the part of the formula you wish to evaluate, press F9, and you see the results. Hit Esc to exit. -- HTH RP (remove nothere from the email address if mailing direct) "Toppers" wrote in message ... Could someone please explain how the following works: =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1)) I understand this calculates the number of unique values in a list so if A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For example, I don't understand how COUNTIF works in this situation i.e where the comparator is an array. In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha string e.g. abcde to a numbers (12345) and summed them. Any explanation of how this works would also be appreciated. I have looked at the XLD site article on SUMPRODUCT and understand the basics but these more advanced uses are baffling me! Many thanks in advance. John |
Explanation of SUMPRODUCT
"Bob Phillips" wrote:
In the example that you show, which should be (at least) =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) In this version of the formula, I don't think you need SUMPRODUCT. SUM will do, since you're dealing with a single array and you aren't doing any dot-producting. I don't even think you need to enter it as an array formula. But neither the SUMPRODUCT or SUM version will work if one of the cells is blank. For the purposes of exploring this SUMPRODUCT construct a little further, let's say we take Bill's example and erase the contents of the cell that has "Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John, ,Bob,Bill,Bill,Max Here's a parsing of another popular version of this formula, one that ignores blank cells: =SUM((A1:A10<"")/COUNTIF(A1:A10,A1:A10&"")) Piece by piece, here's what that says: COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array where each element is the number of times the corresponding cell value appears in the range A1:A10. The extra &"" at the end is a string concatenation. You're adding a zero-length string onto whatever was in each cell. That converts empty cells, which Excel interprets as the number 0, to empty strings. Why do you want to do that? Because if you represent empty cells as 0, then you'll end up with zero in the denominator of the 1/COUNTIF() expression. And you know what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of the expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the &"", the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero. Why is that zero there? Because COUNTIF promotes an empty cell to 0 if the cell is in the 'criteria' argument, but not if the empty cell is in the 'range' argument position. Try this: put the number 0 in B1 and B2. Put COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one at a time. Watch how it affects B3. (Let me know if you find the documentation on this.) Now, back to the numerator of our array-over-array fraction: we've got (A1:A10<"") We're checking whether each cell in A1:A10 is not an empty cell. That evaluates to an array of Booleans: {TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRU E} In arithmetic expressions, Excel interprets TRUE as 1 and FALSE as 0. You will sometimes see Booleans preceded by a double minus, --, which will explicitly force a Boolean-to-integer conversion. So for each entry representing a non-empty cell, we've got 1/(the number of cells with that value in them) and for empty cells we've got 0/(the number of empty cells). Add up the resulting array of fractions and you end up with a count of unique non-empty values in the range A1:A10. I find that I can get away without using an array formula if I use SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have to enter the formula as an array formula. I suppose SUMPRODUCT knows how to handle an array divided by an array piecewise, while SUM doesn't. |
Explanation of SUMPRODUCT
"Shawn O'Donnell" wrote in message
... "Bob Phillips" wrote: In the example that you show, which should be (at least) =SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) In this version of the formula, I don't think you need SUMPRODUCT. SUM will do, since you're dealing with a single array and you aren't doing any dot-producting. I don't even think you need to enter it as an array formula. Yes, we know this, but in this case SUM is an array formula SUMPRODUCT isn't. But neither the SUMPRODUCT or SUM version will work if one of the cells is blank. For the purposes of exploring this SUMPRODUCT construct a little further, let's say we take Bill's example and erase the contents of the cell that has "Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John, ,Bob,Bill,Bill,Max Here's a parsing of another popular version of this formula, one that ignores blank cells: =SUM((A1:A10<"")/COUNTIF(A1:A10,A1:A10&"")) Yes we know all this too, and it works just as well with SUMPRODUCT, which is till not an array formula.As I recall, this technique was first suggested with SP not SUM . Piece by piece, here's what that says: COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array where each element is the number of times the corresponding cell value appears in the range A1:A10. The extra &"" at the end is a string concatenation. You're adding a zero-length string onto whatever was in each cell. That converts empty cells, which Excel interprets as the number 0, to empty strings. Why do you want to do that? Because if you represent empty cells as 0, then you'll end up with zero in the denominator of the 1/COUNTIF() expression. And you know what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of the expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the &"", the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero. You are not adding a "" onto whatever was ikn each cell, buut simply on to the criteria values. Why is that zero there? Because COUNTIF promotes an empty cell to 0 if the cell is in the 'criteria' argument, but not if the empty cell is in the 'range' argument position. Try this: put the number 0 in B1 and B2. Put COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one at a time. Watch how it affects B3. (Let me know if you find the documentation on this.) I assume tyhat you mean =COUNTIF(B1:B2,B2) snip I find that I can get away without using an array formula if I use SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have to enter the formula as an array formula. I suppose SUMPRODUCT knows how to handle an array divided by an array piecewise, while SUM doesn't. It is not SUM that is the problem here but COUNTIF. SUM can work on arrays. SUMPRODUCT can work on arrays. COUNTIF expects to work on a range array, but a criteria value. To work on an array where a single value, be that a hard-coded value or a celle reference, you have to use an array formula. |
Explanation of SUMPRODUCT
"Bob Phillips" wrote:
I assume tyhat you mean =COUNTIF(B1:B2,B2) Thanks for the reply. I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF treats empty cells differently, depending on whether the emtpy cell is in the 'array' or 'criteria' argument. It's instructive to look at the four possibilities for B1 & B2, then compare the results with the same arguments but using COUNTIF(B1,B2&""). It is not SUM that is the problem here but COUNTIF. SUM can work on arrays. SUMPRODUCT can work on arrays. COUNTIF expects to work on a range array, but a criteria value. To work on an array where a single value, be that a hard-coded value or a celle reference, you have to use an array formula. I don't follow you there. How is it that SUMPRODUCT gets to be an array formula regardless of how you enter it, but SUM has to be entered with Ctl-Shift-Enter to become an array formula? All that differs in the two versions is the context provided by SUM & SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide an array context for evaluating functions in its arguments, while SUMPRODUCT does. Here's another example. Say you wanted to know the total number of letters in A1:A10. =SUM(LEN(A1:A10)) will work only if entered as an array formula, but =SUMPRODUCT(LEN(A1:A10)) will work either way. If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the LEN() function, you create an array that SUM can digest, even in a non-array formula. But SUM by itself isn't able to interpret the LEN() as an array without the whole formula being entered as an array formula. SUMPRODUCT can. Do you know of any documentation concerning this array-producing context for evaluating scalar functions on ranges of cells (without having to enter the formula as an array formula)? Is SUMPRODUCT the only function that provides this context? I can imagine using this feature in user-defined functions (if possible,) but it would be nice if I could learn more about the phenomenon first. |
Explanation of SUMPRODUCT
"Shawn O'Donnell" wrote in message ... Hi Shawn, I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF treats empty cells differently, depending on whether the emtpy cell is in the 'array' or 'criteria' argument. It's instructive to look at the four possibilities for B1 & B2, then compare the results with the same arguments but using COUNTIF(B1,B2&""). Well as I said, with COUNTIF(B1,B2) I didn't get the point you were making, but with COUNTIF(B1:B2,B2) I thought I did. It is not SUM that is the problem here but COUNTIF. SUM can work on arrays. SUMPRODUCT can work on arrays. COUNTIF expects to work on a range array, but a criteria value. To work on an array where a single value, be that a hard-coded value or a celle reference, you have to use an array formula. I don't follow you there. How is it that SUMPRODUCT gets to be an array formula regardless of how you enter it, but SUM has to be entered with Ctl-Shift-Enter to become an array formula? What I mean is that SUM will work on an array anyway, COUNTIF will not work on an array criteria (directly that is). So it SUM doesn;y need to be array entered nor does SUMPRODUCT, but if you want the COUNTIF criteria to be a range, the whole thing gets array entered. All that differs in the two versions is the context provided by SUM & SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide an array context for evaluating functions in its arguments, while SUMPRODUCT does. No, again you are looking at the wrongt function. Here's another example. Say you wanted to know the total number of letters in A1:A10. =SUM(LEN(A1:A10)) will work only if entered as an array formula, but =SUMPRODUCT(LEN(A1:A10)) will work either way. If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the LEN() function, you create an array that SUM can digest, even in a non-array formula. But SUM by itself isn't able to interpret the LEN() as an array without the whole formula being entered as an array formula. SUMPRODUCT can. Yes but try =SUM(IF(LEFT(A1:A10,1)="a",LEN(A1:A10))). That doesn't work unless you array enter it. Again, it is not SUM that causes this but IF. =SUMPRODUCT(--(LEFT(A1:A10,1)="a"),LEN(A1:A10)) works non-array entered. Do you know of any documentation concerning this array-producing context for evaluating scalar functions on ranges of cells (without having to enter the formula as an array formula)? Bob Umlas wrote the best paper that I have read. You can get it at http://www.emailoffice.com/excel/arrays-bobumlas.html Is SUMPRODUCT the only function that provides this context? Depends upon what you mean. SUMPRODUCT is not the only function that can work upon ranges/arrays, but it is the only non-array function capable of multiple condition tests. Read more about it at http://www.xldynamic.com/source/xld.SUMPRODUCT.html |
Explanation of SUMPRODUCT
Hi Bob,
The link you provided is very usefull. I learnt a lot. Not only SUMPRODUCT but other functions also. Thanks. Bob Phillips wrote: "Shawn O'Donnell" wrote in message ... Bob Umlas wrote the best paper that I have read. You can get it at http://www.emailoffice.com/excel/arrays-bobumlas.html http://www.xldynamic.com/source/xld.SUMPRODUCT.html |
Explanation of SUMPRODUCT
Hi Madiya,
Glad it helped. Bob "Madiya" wrote in message oups.com... Hi Bob, The link you provided is very usefull. I learnt a lot. Not only SUMPRODUCT but other functions also. Thanks. Bob Phillips wrote: "Shawn O'Donnell" wrote in message ... Bob Umlas wrote the best paper that I have read. You can get it at http://www.emailoffice.com/excel/arrays-bobumlas.html http://www.xldynamic.com/source/xld.SUMPRODUCT.html |
All times are GMT +1. The time now is 05:27 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com