ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Explanation of SUMPRODUCT (https://www.excelbanter.com/excel-programming/325603-explanation-sumproduct.html)

Toppers

Explanation of SUMPRODUCT
 
Could someone please explain how the following works:

=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))

I understand this calculates the number of unique values in a list so if
A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For example,
I don't understand how COUNTIF works in this situation i.e where the
comparator is an array.

In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
string e.g. abcde to a numbers (12345) and summed them. Any explanation of
how this works would also be appreciated.

I have looked at the XLD site article on SUMPRODUCT and understand the
basics but these more advanced uses are baffling me!

Many thanks in advance.

John


Bob Phillips[_6_]

Explanation of SUMPRODUCT
 
John,

Here goes with my attempt.

Let's start by defining the range A1:A10 to talk specifics.

Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max

In the example that you show, which should be (at least)
=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an
array value is being used to count the number of each of times the value in
each cell is repeated.
So in this case, A1 holds Bob which is repeated 4 times, so that part of the
COUNTIF returns 4.
A2 holds John, so COUNTIF returns 3 for A2.
But A3 also holds Bob, which also returns 4.
And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work
through).
So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}.
The array results of the COUNTIF are then divided into 1 to get a fractional
value of each element of the array. This is the part that effectively does
the counting, as the 4 instances of Bob each return an array element of 4,
which when divided into 1, each give 0.25, which when added together gives
1. Voila.
So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is
{0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0 .5;1}
SUMPRODUCT then adds these up to come up with the number of unique entries,
4 in this case, because each separate value in the test range sums to 1.

I don't recall the case that you mention, but I do recall a formula where I
used SUMPRODUCT to add the values at the end of a mixed text/number field,
like Mike 10. The formula in question was

=SUMPRODUCT(--(MID(A1:A10,FIND*("~",SUBSTITUTE(A1:A10,"%%%","~", LEN(A1:A10)-
LEN(SUBSTITU*TE(A1:A10," ",""))))+1,99)))

Not that the %%% should be a single space, but I put %%% in as the NG wraps
at that point and can lose it.

What this is doing is simply using a tried and trusted formula,
MID(A1,FIND*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITU*TE(A1,"
",""))))+1,99), but this time on a range not a single cell, to get the last
part of a delimited string, delimited by space in this case. So in Mike 10,
it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just
to sum the results.

BTW, to get a better understanding of what goes on in these things, use the
F9 to evaluate the formula. In the formula bar, select the part of the
formula you wish to evaluate, press F9, and you see the results. Hit Esc to
exit.

--

HTH

RP
(remove nothere from the email address if mailing direct)


"Toppers" wrote in message
...
Could someone please explain how the following works:

=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))

I understand this calculates the number of unique values in a list so if
A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For

example,
I don't understand how COUNTIF works in this situation i.e where the
comparator is an array.

In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
string e.g. abcde to a numbers (12345) and summed them. Any explanation

of
how this works would also be appreciated.

I have looked at the XLD site article on SUMPRODUCT and understand the
basics but these more advanced uses are baffling me!

Many thanks in advance.

John




Toppers

Explanation of SUMPRODUCT
 
Bob,
Many thannks for the prompt reply and excellent explanation (plus F9
tip). All is much clearer now!

I never cease to be amazed by the ingenuity shown by the experts like
yourself on using the features/functions in Excel ... but it's great to have
you all available online.

Again many thanks - your help is really appreciated.

John


"Bob Phillips" wrote:

John,

Here goes with my attempt.

Let's start by defining the range A1:A10 to talk specifics.

Bob,John,Bob,Bob,John,Jon,Bob,Bill,Bill,Max

In the example that you show, which should be (at least)
=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10))) by the way, the COUNTIF with an
array value is being used to count the number of each of times the value in
each cell is repeated.
So in this case, A1 holds Bob which is repeated 4 times, so that part of the
COUNTIF returns 4.
A2 holds John, so COUNTIF returns 3 for A2.
But A3 also holds Bob, which also returns 4.
And so on, with return values of 4,3,3,4,2,1 (I'll leave that to you to work
through).
So the COUNTIF returns an array of {4,3,4,4,3,3,4,2,1}.
The array results of the COUNTIF are then divided into 1 to get a fractional
value of each element of the array. This is the part that effectively does
the counting, as the 4 instances of Bob each return an array element of 4,
which when divided into 1, each give 0.25, which when added together gives
1. Voila.
So the array returned by 1/(COUNTIF(A1:A10,A1:A10)) is
{0.25;0.33333;0.25;0.25;0.33333;0.33333;0.25;0.5;0 .5;1}
SUMPRODUCT then adds these up to come up with the number of unique entries,
4 in this case, because each separate value in the test range sums to 1.

I don't recall the case that you mention, but I do recall a formula where I
used SUMPRODUCT to add the values at the end of a mixed text/number field,
like Mike 10. The formula in question was

=SUMPRODUCT(--(MID(A1:A10,FINDĀ*("~",SUBSTITUTE(A1:A10,"%%%","~" ,LEN(A1:A10)-
LEN(SUBSTITUĀ*TE(A1:A10," ",""))))+1,99)))

Not that the %%% should be a single space, but I put %%% in as the NG wraps
at that point and can lose it.

What this is doing is simply using a tried and trusted formula,
MID(A1,FINDĀ*("~",SUBSTITUTE(A1,"","~",LEN(A1)-LEN(SUBSTITUĀ*TE(A1,"
",""))))+1,99), but this time on a range not a single cell, to get the last
part of a delimited string, delimited by space in this case. So in Mike 10,
it returns 10, In Text String 18 it returns 18. SUMPRODUCT is then used just
to sum the results.

BTW, to get a better understanding of what goes on in these things, use the
F9 to evaluate the formula. In the formula bar, select the part of the
formula you wish to evaluate, press F9, and you see the results. Hit Esc to
exit.

--

HTH

RP
(remove nothere from the email address if mailing direct)


"Toppers" wrote in message
...
Could someone please explain how the following works:

=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A1))

I understand this calculates the number of unique values in a list so if
A1:A5 contains 1 and A6:A10 contains 2 the answer is 2 (items). For

example,
I don't understand how COUNTIF works in this situation i.e where the
comparator is an array.

In a recent question, Bob Phillips used SUMPRODUCT to convert an alpha
string e.g. abcde to a numbers (12345) and summed them. Any explanation

of
how this works would also be appreciated.

I have looked at the XLD site article on SUMPRODUCT and understand the
basics but these more advanced uses are baffling me!

Many thanks in advance.

John





Shawn O'Donnell

Explanation of SUMPRODUCT
 
"Bob Phillips" wrote:
In the example that you show, which should be (at least)
=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10)))


In this version of the formula, I don't think you need SUMPRODUCT. SUM will
do, since you're dealing with a single array and you aren't doing any
dot-producting. I don't even think you need to enter it as an array formula.


But neither the SUMPRODUCT or SUM version will work if one of the cells is
blank.

For the purposes of exploring this SUMPRODUCT construct a little further,
let's say we take Bill's example and erase the contents of the cell that has
"Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John,
,Bob,Bill,Bill,Max

Here's a parsing of another popular version of this formula, one that
ignores blank cells:

=SUM((A1:A10<"")/COUNTIF(A1:A10,A1:A10&""))

Piece by piece, here's what that says:

COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array where
each element is the number of times the corresponding cell value appears in
the range A1:A10.

The extra &"" at the end is a string concatenation. You're adding a
zero-length string onto whatever was in each cell. That converts empty
cells, which Excel interprets as the number 0, to empty strings. Why do you
want to do that? Because if you represent empty cells as 0, then you'll end
up with zero in the denominator of the 1/COUNTIF() expression. And you know
what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of the
expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the &"",
the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero.

Why is that zero there? Because COUNTIF promotes an empty cell to 0 if the
cell is in the 'criteria' argument, but not if the empty cell is in the
'range' argument position. Try this: put the number 0 in B1 and B2. Put
COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one at a
time. Watch how it affects B3. (Let me know if you find the documentation
on this.)

Now, back to the numerator of our array-over-array fraction: we've got
(A1:A10<"") We're checking whether each cell in A1:A10 is not an empty
cell. That evaluates to an array of Booleans:
{TRUE;TRUE;TRUE;TRUE;TRUE;FALSE;TRUE;TRUE;TRUE;TRU E}

In arithmetic expressions, Excel interprets TRUE as 1 and FALSE as 0. You
will sometimes see Booleans preceded by a double minus, --, which will
explicitly force a Boolean-to-integer conversion.

So for each entry representing a non-empty cell, we've got

1/(the number of cells with that value in them)

and for empty cells we've got

0/(the number of empty cells).

Add up the resulting array of fractions and you end up with a count of
unique non-empty values in the range A1:A10.


I find that I can get away without using an array formula if I use
SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have to
enter the formula as an array formula. I suppose SUMPRODUCT knows how to
handle an array divided by an array piecewise, while SUM doesn't.



Bob Phillips[_6_]

Explanation of SUMPRODUCT
 
"Shawn O'Donnell" wrote in message
...
"Bob Phillips" wrote:
In the example that you show, which should be (at least)
=SUMPRODUCT(1/(COUNTIF(A1:A10,A1:A10)))


In this version of the formula, I don't think you need SUMPRODUCT. SUM

will
do, since you're dealing with a single array and you aren't doing any
dot-producting. I don't even think you need to enter it as an array

formula.

Yes, we know this, but in this case SUM is an array formula SUMPRODUCT
isn't.

But neither the SUMPRODUCT or SUM version will work if one of the cells is
blank.

For the purposes of exploring this SUMPRODUCT construct a little further,
let's say we take Bill's example and erase the contents of the cell that

has
"Jon" in it. The data in A1:A10 is then Bob,John,Bob,Bob,John,
,Bob,Bill,Bill,Max

Here's a parsing of another popular version of this formula, one that
ignores blank cells:

=SUM((A1:A10<"")/COUNTIF(A1:A10,A1:A10&""))


Yes we know all this too, and it works just as well with SUMPRODUCT, which
is till not an array formula.As I recall, this technique was first suggested
with SP not SUM .

Piece by piece, here's what that says:

COUNTIF(A1:A10,A1:A10), as Bob explained, produces a 10-element array

where
each element is the number of times the corresponding cell value appears

in
the range A1:A10.

The extra &"" at the end is a string concatenation. You're adding a
zero-length string onto whatever was in each cell. That converts empty
cells, which Excel interprets as the number 0, to empty strings. Why do

you
want to do that? Because if you represent empty cells as 0, then you'll

end
up with zero in the denominator of the 1/COUNTIF() expression. And you

know
what happens when you do that. So the COUNTIF(A1:A10.A1:A10&"") part of

the
expression evaluates to: {4;2;4;4;2;1;4;2;2;1}. If we didn't have the

&"",
the array would have been {4;2;4;4;2;0;4;2;2;1}. Notice the zero.


You are not adding a "" onto whatever was ikn each cell, buut simply on to
the criteria values.

Why is that zero there? Because COUNTIF promotes an empty cell to 0 if

the
cell is in the 'criteria' argument, but not if the empty cell is in the
'range' argument position. Try this: put the number 0 in B1 and B2. Put
COUNTIF(B1,B2) in cell B3. Then try deleting the zeros in B1 and B2 one

at a
time. Watch how it affects B3. (Let me know if you find the

documentation
on this.)


I assume tyhat you mean =COUNTIF(B1:B2,B2)

snip


I find that I can get away without using an array formula if I use
SUMPRODUCT to add the array, but not if I use SUM. If I use SUM, I have

to
enter the formula as an array formula. I suppose SUMPRODUCT knows how to
handle an array divided by an array piecewise, while SUM doesn't.


It is not SUM that is the problem here but COUNTIF. SUM can work on arrays.
SUMPRODUCT can work on arrays. COUNTIF expects to work on a range array, but
a criteria value. To work on an array where a single value, be that a
hard-coded value or a celle reference, you have to use an array formula.



Shawn O'Donnell

Explanation of SUMPRODUCT
 
"Bob Phillips" wrote:
I assume tyhat you mean =COUNTIF(B1:B2,B2)


Thanks for the reply.

I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF
treats empty cells differently, depending on whether the emtpy cell is in the
'array' or 'criteria' argument. It's instructive to look at the four
possibilities for B1 & B2, then compare the results with the same arguments
but using COUNTIF(B1,B2&"").

It is not SUM that is the problem here but COUNTIF. SUM can work
on arrays. SUMPRODUCT can work on arrays. COUNTIF expects
to work on a range array, but a criteria value. To work on an array
where a single value, be that a hard-coded value or a celle reference,
you have to use an array formula.


I don't follow you there. How is it that SUMPRODUCT gets to be an array
formula regardless of how you enter it, but SUM has to be entered with
Ctl-Shift-Enter to become an array formula?

All that differs in the two versions is the context provided by SUM &
SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide an
array context for evaluating functions in its arguments, while SUMPRODUCT
does.

Here's another example. Say you wanted to know the total number of letters
in A1:A10.

=SUM(LEN(A1:A10)) will work only if entered as an array formula, but
=SUMPRODUCT(LEN(A1:A10)) will work either way.

If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the
LEN() function, you create an array that SUM can digest, even in a non-array
formula. But SUM by itself isn't able to interpret the LEN() as an array
without the whole formula being entered as an array formula. SUMPRODUCT can.

Do you know of any documentation concerning this array-producing context for
evaluating scalar functions on ranges of cells (without having to enter the
formula as an array formula)?

Is SUMPRODUCT the only function that provides this context?

I can imagine using this feature in user-defined functions (if possible,)
but it would be nice if I could learn more about the phenomenon first.


Bob Phillips[_6_]

Explanation of SUMPRODUCT
 

"Shawn O'Donnell" wrote in message
...

Hi Shawn,

I meant COUNTIF(B1,B2), since that's the easiest way to show how COUNTIF
treats empty cells differently, depending on whether the emtpy cell is in

the
'array' or 'criteria' argument. It's instructive to look at the four
possibilities for B1 & B2, then compare the results with the same

arguments
but using COUNTIF(B1,B2&"").


Well as I said, with COUNTIF(B1,B2) I didn't get the point you were making,
but with COUNTIF(B1:B2,B2) I thought I did.

It is not SUM that is the problem here but COUNTIF. SUM can work
on arrays. SUMPRODUCT can work on arrays. COUNTIF expects
to work on a range array, but a criteria value. To work on an array
where a single value, be that a hard-coded value or a celle reference,
you have to use an array formula.


I don't follow you there. How is it that SUMPRODUCT gets to be an array
formula regardless of how you enter it, but SUM has to be entered with
Ctl-Shift-Enter to become an array formula?


What I mean is that SUM will work on an array anyway, COUNTIF will not work
on an array criteria (directly that is). So it SUM doesn;y need to be array
entered nor does SUMPRODUCT, but if you want the COUNTIF criteria to be a
range, the whole thing gets array entered.

All that differs in the two versions is the context provided by SUM &
SUMPRODUCT. Both functions accept arrays, but SUM doesn't seem to provide

an
array context for evaluating functions in its arguments, while SUMPRODUCT
does.


No, again you are looking at the wrongt function.

Here's another example. Say you wanted to know the total number of

letters
in A1:A10.

=SUM(LEN(A1:A10)) will work only if entered as an array formula, but
=SUMPRODUCT(LEN(A1:A10)) will work either way.

If, in the SUM example, you select LEN(A1:A10) and hit F9 to evaluate the
LEN() function, you create an array that SUM can digest, even in a

non-array
formula. But SUM by itself isn't able to interpret the LEN() as an array
without the whole formula being entered as an array formula. SUMPRODUCT

can.

Yes but try =SUM(IF(LEFT(A1:A10,1)="a",LEN(A1:A10))). That doesn't work
unless you array enter it. Again, it is not SUM that causes this but IF.

=SUMPRODUCT(--(LEFT(A1:A10,1)="a"),LEN(A1:A10)) works non-array entered.

Do you know of any documentation concerning this array-producing context

for
evaluating scalar functions on ranges of cells (without having to enter

the
formula as an array formula)?


Bob Umlas wrote the best paper that I have read. You can get it at
http://www.emailoffice.com/excel/arrays-bobumlas.html

Is SUMPRODUCT the only function that provides this context?


Depends upon what you mean. SUMPRODUCT is not the only function that can
work upon ranges/arrays, but it is the only non-array function capable of
multiple condition tests. Read more about it at

http://www.xldynamic.com/source/xld.SUMPRODUCT.html



Madiya

Explanation of SUMPRODUCT
 
Hi Bob,
The link you provided is very usefull. I learnt a lot. Not only
SUMPRODUCT but other functions also.
Thanks.

Bob Phillips wrote:
"Shawn O'Donnell" wrote in

message
...


Bob Umlas wrote the best paper that I have read. You can get it at
http://www.emailoffice.com/excel/arrays-bobumlas.html



http://www.xldynamic.com/source/xld.SUMPRODUCT.html



Bob Phillips[_6_]

Explanation of SUMPRODUCT
 
Hi Madiya,

Glad it helped.

Bob


"Madiya" wrote in message
oups.com...
Hi Bob,
The link you provided is very usefull. I learnt a lot. Not only
SUMPRODUCT but other functions also.
Thanks.

Bob Phillips wrote:
"Shawn O'Donnell" wrote in

message
...


Bob Umlas wrote the best paper that I have read. You can get it at
http://www.emailoffice.com/excel/arrays-bobumlas.html



http://www.xldynamic.com/source/xld.SUMPRODUCT.html






All times are GMT +1. The time now is 05:27 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com