Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Filtration Guy
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

I am using linest to calculate a simple y = mx + b regression with an n of
57 values. I also have it return the r^2 value. Linest gives the same
values as the chart trendline r^2, and my manual calculation of r^2, when I
set constant to "TRUE". However if I set constant to "FALSE", and thus force
the line through zero, linest returns an r^2 value that is higher than the
chart trendline value (0.9909 vs 0.99750).
Excel Help mentions that the degrees of freedom change when setting
constant to false, but my manual calculation of "Degrees of Freedom Adjusted
R-Square" does not give the linest value, it gives 0.099746.
Is there a bug in linest with respect to r2 values of forced lines?
Thanks.

-Filtration Guy
  #2   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

http://support.microsoft.com/kb/828533/

In Excel 2002 and earlier LINEST returned an incorrect R^2 value if
constant=FALSE.

Jerry

"Filtration Guy" wrote:

I am using linest to calculate a simple y = mx + b regression with an n of
57 values. I also have it return the r^2 value. Linest gives the same
values as the chart trendline r^2, and my manual calculation of r^2, when I
set constant to "TRUE". However if I set constant to "FALSE", and thus force
the line through zero, linest returns an r^2 value that is higher than the
chart trendline value (0.9909 vs 0.99750).
Excel Help mentions that the degrees of freedom change when setting
constant to false, but my manual calculation of "Degrees of Freedom Adjusted
R-Square" does not give the linest value, it gives 0.099746.
Is there a bug in linest with respect to r2 values of forced lines?
Thanks.

-Filtration Guy

  #3   Report Post  
Filtration Guy
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

Thanks for your reply. I checked my version and it says that it is
"Microsoft Office Excel 2003 (11.6355.6360) SP1" "Part of Microsoft Office
Professional Edition 2003"
So maybe this problem was sort of but not quite fixed. Since posting my
original question I have found information that tells that the old error was
obvious (returned numbers not between 0 and 1), but this one seems to be off
only by a little bit.

-Filter Guy


"Jerry W. Lewis" wrote:

http://support.microsoft.com/kb/828533/

In Excel 2002 and earlier LINEST returned an incorrect R^2 value if
constant=FALSE.

Jerry

"Filtration Guy" wrote:

I am using linest to calculate a simple y = mx + b regression with an n of
57 values. I also have it return the r^2 value. Linest gives the same
values as the chart trendline r^2, and my manual calculation of r^2, when I
set constant to "TRUE". However if I set constant to "FALSE", and thus force
the line through zero, linest returns an r^2 value that is higher than the
chart trendline value (0.9909 vs 0.99750).
Excel Help mentions that the degrees of freedom change when setting
constant to false, but my manual calculation of "Degrees of Freedom Adjusted
R-Square" does not give the linest value, it gives 0.099746.
Is there a bug in linest with respect to r2 values of forced lines?
Thanks.

-Filtration Guy

  #4   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

Post example data (within body of reply, not attachments, please).

Jerry

Filtration Guy wrote:

Thanks for your reply. I checked my version and it says that it is
"Microsoft Office Excel 2003 (11.6355.6360) SP1" "Part of Microsoft Office
Professional Edition 2003"
So maybe this problem was sort of but not quite fixed. Since posting my
original question I have found information that tells that the old error was
obvious (returned numbers not between 0 and 1), but this one seems to be off
only by a little bit.

-Filter Guy


"Jerry W. Lewis" wrote:


  #5   Report Post  
Filtration Guy
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

To see the strange behaviour of linest forced r2 do the following:

Paste into A1:B10

x y
0.298 0.001
1.204 0.018
2.109 0.024
3.015 0.035
3.921 0.046
4.827 0.057
5.732 0.063
6.638 0.069
7.544 0.08

Plot the data in a chart. Plot two trendlines, with equation and r2 shown,
the first not forced, the second forced through zero.

In D21:E23 enter the following unforced array formula:
=LINEST(B2:B10,A2:A10,TRUE,TRUE)

In G21:H23 enter the following forced array formula:
=LINEST(B2:B10,A2:A10,FALSE,TRUE)

To manually calculate r2 for unforced:
(note K1 is the r2 value)
K1 =1-SUM(K2:K10)/SUM(L2:L10)
K2 and copy down to K10 =(B2-E$21-D$21*A2)^2
L2 and copy down to L10 =(B2-AVERAGE(B$2:B$10))^2

To manually calculate r2 for forced (this is not accounting for df change
due to forcing)
(note M1 is the r2 value)
M1 =1-SUM(M2:M10)/SUM(N2:N10)
M2 and copy down to M10 =(B2-H$21-G$21*A2)^2
N2 and copy down to N10 =(B2-AVERAGE(B$2:B$10))^2


The m and b values of the line fit agree in both cases.
The six decimal place unforced r2 in all three cases is returned as 0.988291
The six decimal place forced r2 on the chart and manual calculation returns
0.985336, the linest returns 0.996461.

I am not sure what the propper correction for the change in df due to
forcing is or if it matter at all. The only on that I could find is as
follows: df corrected r2 = 1-(1-r2)*n/(n-1)
This correction, in M13
=1-(1-M1)*COUNT(A2:A10)/(COUNT(A2:A10)-1)
yields: 0.983503, which is not what the forced linest gives.

Thanks, I appreciate any help that you can give.

-Filtration Guy


"Jerry W. Lewis" wrote:

Post example data (within body of reply, not attachments, please).

Jerry

Filtration Guy wrote:

Thanks for your reply. I checked my version and it says that it is
"Microsoft Office Excel 2003 (11.6355.6360) SP1" "Part of Microsoft Office
Professional Edition 2003"
So maybe this problem was sort of but not quite fixed. Since posting my
original question I have found information that tells that the old error was
obvious (returned numbers not between 0 and 1), but this one seems to be off
only by a little bit.

-Filter Guy


"Jerry W. Lewis" wrote:





  #6   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default linest r-squared bug when calculating forced through zero?

Thank you, you have discovered that when MS corrected Excel 2003's
LINEST for regressions forced through the origin, they failed to also
correct the R^2 for the chart trendline. I will file that information
away for testing of future versions.

Your expectations for how R^2 should be calculated when forced through
the origin should be modified, see for example NIST's certified
calculations for forced regressions
http://www.itl.nist.gov/div898/strd/...ATA/NoInt1.dat
http://www.itl.nist.gov/div898/strd/...ATA/NoInt2.dat

Jerry

Filtration Guy wrote:

To see the strange behaviour of linest forced r2 do the following:

Paste into A1:B10

x y
0.298 0.001
1.204 0.018
2.109 0.024
3.015 0.035
3.921 0.046
4.827 0.057
5.732 0.063
6.638 0.069
7.544 0.08

Plot the data in a chart. Plot two trendlines, with equation and r2 shown,
the first not forced, the second forced through zero.

In D21:E23 enter the following unforced array formula:
=LINEST(B2:B10,A2:A10,TRUE,TRUE)

In G21:H23 enter the following forced array formula:
=LINEST(B2:B10,A2:A10,FALSE,TRUE)

To manually calculate r2 for unforced:
(note K1 is the r2 value)
K1 =1-SUM(K2:K10)/SUM(L2:L10)
K2 and copy down to K10 =(B2-E$21-D$21*A2)^2
L2 and copy down to L10 =(B2-AVERAGE(B$2:B$10))^2

To manually calculate r2 for forced (this is not accounting for df change
due to forcing)
(note M1 is the r2 value)
M1 =1-SUM(M2:M10)/SUM(N2:N10)
M2 and copy down to M10 =(B2-H$21-G$21*A2)^2
N2 and copy down to N10 =(B2-AVERAGE(B$2:B$10))^2


The m and b values of the line fit agree in both cases.
The six decimal place unforced r2 in all three cases is returned as 0.988291
The six decimal place forced r2 on the chart and manual calculation returns
0.985336, the linest returns 0.996461.

I am not sure what the propper correction for the change in df due to
forcing is or if it matter at all. The only on that I could find is as
follows: df corrected r2 = 1-(1-r2)*n/(n-1)
This correction, in M13
=1-(1-M1)*COUNT(A2:A10)/(COUNT(A2:A10)-1)
yields: 0.983503, which is not what the forced linest gives.

Thanks, I appreciate any help that you can give.

-Filtration Guy


"Jerry W. Lewis" wrote:


Post example data (within body of reply, not attachments, please).

Jerry

Filtration Guy wrote:


Thanks for your reply. I checked my version and it says that it is
"Microsoft Office Excel 2003 (11.6355.6360) SP1" "Part of Microsoft Office
Professional Edition 2003"
So maybe this problem was sort of but not quite fixed. Since posting my
original question I have found information that tells that the old error was
obvious (returned numbers not between 0 and 1), but this one seems to be off
only by a little bit.

-Filter Guy


"Jerry W. Lewis" wrote:



  #7   Report Post  
Filtration Guy
 
Posts: n/a
Default Chart fixed r2 has a bug, use LINEST instead!

Thanks again for your help with this.

I was able to sift through the NIST website and find the propper formula for
fixed r2: 1 - the quotent of the unexplained error and sum of the y
squareds. So to correct my example:
N2 should be: =B2^2 and copy this down through N10
This should give an r2 of 0.996461, the same result as the fixed linest!
I hope that the chart fixed r2 does get corrected in future versions but for
now USE LINEST!!!

-Filtration Guy



"Jerry W. Lewis" wrote:

Thank you, you have discovered that when MS corrected Excel 2003's
LINEST for regressions forced through the origin, they failed to also
correct the R^2 for the chart trendline. I will file that information
away for testing of future versions.

Your expectations for how R^2 should be calculated when forced through
the origin should be modified, see for example NIST's certified
calculations for forced regressions
http://www.itl.nist.gov/div898/strd/...ATA/NoInt1.dat
http://www.itl.nist.gov/div898/strd/...ATA/NoInt2.dat

Jerry

Filtration Guy wrote:

To see the strange behaviour of linest forced r2 do the following:

Paste into A1:B10

x y
0.298 0.001
1.204 0.018
2.109 0.024
3.015 0.035
3.921 0.046
4.827 0.057
5.732 0.063
6.638 0.069
7.544 0.08

Plot the data in a chart. Plot two trendlines, with equation and r2 shown,
the first not forced, the second forced through zero.

In D21:E23 enter the following unforced array formula:
=LINEST(B2:B10,A2:A10,TRUE,TRUE)

In G21:H23 enter the following forced array formula:
=LINEST(B2:B10,A2:A10,FALSE,TRUE)

To manually calculate r2 for unforced:
(note K1 is the r2 value)
K1 =1-SUM(K2:K10)/SUM(L2:L10)
K2 and copy down to K10 =(B2-E$21-D$21*A2)^2
L2 and copy down to L10 =(B2-AVERAGE(B$2:B$10))^2

To manually calculate r2 for forced (this is not accounting for df change
due to forcing)
(note M1 is the r2 value)
M1 =1-SUM(M2:M10)/SUM(N2:N10)
M2 and copy down to M10 =(B2-H$21-G$21*A2)^2
N2 and copy down to N10 =(B2-AVERAGE(B$2:B$10))^2


The m and b values of the line fit agree in both cases.
The six decimal place unforced r2 in all three cases is returned as 0.988291
The six decimal place forced r2 on the chart and manual calculation returns
0.985336, the linest returns 0.996461.

I am not sure what the propper correction for the change in df due to
forcing is or if it matter at all. The only on that I could find is as
follows: df corrected r2 = 1-(1-r2)*n/(n-1)
This correction, in M13
=1-(1-M1)*COUNT(A2:A10)/(COUNT(A2:A10)-1)
yields: 0.983503, which is not what the forced linest gives.

Thanks, I appreciate any help that you can give.

-Filtration Guy


"Jerry W. Lewis" wrote:


Post example data (within body of reply, not attachments, please).

Jerry

Filtration Guy wrote:


Thanks for your reply. I checked my version and it says that it is
"Microsoft Office Excel 2003 (11.6355.6360) SP1" "Part of Microsoft Office
Professional Edition 2003"
So maybe this problem was sort of but not quite fixed. Since posting my
original question I have found information that tells that the old error was
obvious (returned numbers not between 0 and 1), but this one seems to be off
only by a little bit.

-Filter Guy


"Jerry W. Lewis" wrote:




  #8   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default Chart fixed r2 has a bug, use LINEST instead!

You're welcome. Glad it helped.

Jerry

Filtration Guy wrote:

Thanks again for your help with this.


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
linest won't accept multiple reference ranges Alex Gardner Excel Worksheet Functions 3 May 18th 05 01:20 PM
How to use linest with variably sized data arrays? [email protected] Excel Worksheet Functions 0 April 13th 05 04:56 PM
LINEST bug with cubic polynomials in Excel 2003 byundt Excel Worksheet Functions 3 March 21st 05 02:15 PM
calculating excel spreadsheet files for pensions and life insurance (including age calculation sheets) RICHARD Excel Worksheet Functions 1 March 15th 05 05:49 PM
AGE CALCULATING EXCEL SPREADSHEETS AND OTHERS! RICHARD Excel Discussion (Misc queries) 0 March 1st 05 01:53 PM


All times are GMT +1. The time now is 01:15 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"