Thread: R2 Help Needed
View Single Post
  #1   Report Post  
Posted to microsoft.public.excel.misc
Jerry W. Lewis
 
Posts: n/a
Default R2 Help Needed

" wrote:

One need only look at the mathematical formulas -- the one on
the RSQ help page and yours -- to see that there is no reason
to expect them to match, at least not without an elaborate proof.
(IMHO, it is "obvious" that they would not normally match.)

....
Yes. But this is an example where "the whole might not equal
the sum of its parts".

RSQ is a measure of correlation between two sets of data -- for
example, actual sales v. target sales in each of several weeks.
RSQ is a value between 0 and 1. It is not an average deviation.


Exactly

A simple example where overall RSQ obviously need not equal the weighted
average of subgroup RSQ's is the case where each subgroup has only 2
non-identical observations (x,y pairs). In that case, RSQ=1 for each
subgroup, but there is no necessary requirement that the overall RSQ of the
six observations will also be one.

More generally, if we define
M = AVERAGE(data)
SS = DEVSQ(data)
SC = COUNT(data)*COVAR(xdata,ydata)
then
RSQ(xdata,ydata) = SC(xdata,ydata)^2/(SS(xdata)*SS(ydata))

If we have three subgroups, then with
M = M1*n1/n +M2*n2/n +M3*n3/n
SSx = (SSx1+SSx2+SSx3 +n1*(Mx1-Mx)^2 +n2*(Mx2-Mx)^2 +n3*(Mx3-Mx)^2
SSy = (SSy1+SSy2+SSy3 +n1*(My1-My)^2 +n2*(My2-My)^2 +n3*(My3-My)^2
SC = (SC1+SC2+SC3 +n1*(Mx1-Mx)*(My1-My) +n2*(Mx2-Mx)*(My2-My)
+n3*(Mx3-Mx)*(My3-My)

If we greatly simplify by assuming that n1=n2=n3, Mx1=Mx2=Mx3 and
My1=My2=My3, then the overall RSQ reduces to
RSQ = (SC1+SC2+SC3)^2/((SSx1+SSx2+SSx3)*(SSx1+SSx2+SSx3))
which still need not equal
AVERAGE( SC1^2/(SSx1*SSy1), SC2^2/(SSx2*SSy2), SC3^2/(SSx3*SSy3) )

Jerry