Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
mathman
 
Posts: n/a
Default LINEST bug in Excel 2003

I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the problem.

I use two columns of data for x and y with in this example a row of 7 cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot find
any reference to this insidious problem on the internet, has anyone else
experienced it?

  #2   Report Post  
Gary's Student
 
Posts: n/a
Default

Please check out Jerry W. Lewis' posting of 5/6/2005
--
Gary's Student


"mathman" wrote:

I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the problem.

I use two columns of data for x and y with in this example a row of 7 cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot find
any reference to this insidious problem on the internet, has anyone else
experienced it?

  #3   Report Post  
Mike Middleton
 
Posts: n/a
Default

mathman -

I think you should also be concerned about the "insidious problem" you may
encounter from overfitting the data when using a sixth-order polynomial.

- Mike
www.mikemiddleton.com

"mathman" wrote in message
...
I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all
the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the
problem.

I use two columns of data for x and y with in this example a row of 7
cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a
wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot
find
any reference to this insidious problem on the internet, has anyone else
experienced it?



  #4   Report Post  
mathman
 
Posts: n/a
Default

Thanks for trying to help, I can only find one post from Jerry on 5/6/2005
under the thread "simple statistical analysis" this did not seem relevant but
perhaps I missed the point.

"Gary's Student" wrote:

Please check out Jerry W. Lewis' posting of 5/6/2005
--
Gary's Student


"mathman" wrote:

I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the problem.

I use two columns of data for x and y with in this example a row of 7 cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot find
any reference to this insidious problem on the internet, has anyone else
experienced it?

  #5   Report Post  
mathman
 
Posts: n/a
Default

Thanks for your interest - the real data has many more points the ranges were
just an example. The problem of missing first coeff applies to 3rd order 4th
order 5th order etc.

I have tried multiple linear fitting and that fails in the same way in Excel
2003 but works perfectly in Excel 97, Excel 2000 & Excel 2002

"Mike Middleton" wrote:

mathman -

I think you should also be concerned about the "insidious problem" you may
encounter from overfitting the data when using a sixth-order polynomial.

- Mike
www.mikemiddleton.com

"mathman" wrote in message
...
I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all
the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the
problem.

I use two columns of data for x and y with in this example a row of 7
cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a
wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot
find
any reference to this insidious problem on the internet, has anyone else
experienced it?






  #6   Report Post  
Harlan Grove
 
Posts: n/a
Default

"mathman" wrote...
Thanks for trying to help, I can only find one post from Jerry on 5/6/2005
under the thread "simple statistical analysis" this did not seem relevant
but perhaps I missed the point.

....

See

http://groups-beta.google.com/group/...e=source&hl=en

(or http://makeashorterlink.com/?A19821A3B ).


  #7   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

Select 7 columns in a row and array enter (Ctrl-Shift-Enter) the
following formula to reproduce the pre 2003 calculation, except that the
coefficients will be given in reverse order (intercept, linear, ... 6th
power).

=TRANSPOSE(MMULT(MINVERSE(MMULT(TRANSPOSE(xCol^{0, 1,2,3,4,5,6}),
xCol^{0,1,2,3,4,5,6})), MMULT(TRANSPOSE(xCol^{0,1,2,3,4,5,6}), yCol)))

Note the explicit zero power in this formula.

In addition to the potential for overfitting the data, I would be
concerned about numerical stability of the calculation (the reason that
Excel 2003 changed its approach).

http://groups-beta.google.com/group/...916351b5101102

Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial
trendline (format to scientific notation with 14 decimal places); if
there is an appreciable difference between the graph and formula
coefficients, then the graph coefficients are better.

David Braden has posted code to directly extract coefficients from the
displayed chart trendline equation into cells

http://groups.google.com/groups?selm....microsoft.com

Jerry

mathman wrote:

Thanks for your interest - the real data has many more points the ranges were
just an example. The problem of missing first coeff applies to 3rd order 4th
order 5th order etc.

I have tried multiple linear fitting and that fails in the same way in Excel
2003 but works perfectly in Excel 97, Excel 2000 & Excel 2002

"Mike Middleton" wrote:


mathman -

I think you should also be concerned about the "insidious problem" you may
encounter from overfitting the data when using a sixth-order polynomial.

- Mike
www.mikemiddleton.com

"mathman" wrote in message
...

I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all
the
existing spreadsheets give the wrong answers but no error messages. I have
read the KB articles on LINEST in 2003 and none give a clue to the
problem.

I use two columns of data for x and y with in this example a row of 7
cells
highlighted with this array formula:

=LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6})

Excel 97 -2002 dutifully reports 7 cells with the coeffs of
X6,X5,X4,X3,X2,X1 & X0
Excel 2003 sets X^6 to zero regardless of the data set and reports a
wildly
wrong value for the constant.

I have tested this on a clean build of Excel 2003 with the Analysis Tool
pack loaded as well as several of our 8000 corporate desktops. I cannot
find
any reference to this insidious problem on the internet, has anyone else
experienced it?


  #8   Report Post  
Harlan Grove
 
Posts: n/a
Default

Jerry W. Lewis wrote...
....
Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial
trendline (format to scientific notation with 14 decimal places); if
there is an appreciable difference between the graph and formula
coefficients, then the graph coefficients are better.

David Braden has posted code to directly extract coefficients from the
displayed chart trendline equation into cells

....

Meaning that Microsoft is still using different pieces of code for
LINEST and chart trend lines? And the older, trend line code is
better? If so, it begs the question why Microsoft squandered their
money changing the LINEST code rather than just using the trend
line code for LINEST.

  #9   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

You are correct, LINEST and chart trendline are not doing exactly the
same thing. I don't know why MS reinvented the wheel here, but it is
not the first example of redundant Excel functions which are programmed
differently.

It has been a while since I did the comparisons, but my vague
recollections are that:

- Chart is slightly better numerically than 2003 LINEST (when no
coefficient estimates are exactly zero)

- 2003 LINEST (when no coefficient estimates are exactly zero) is
comparable to lm() in S-PLUS and R, and thus is far superior to previous
versions of LINEST (except for the anomaly under discussion).

- 2003 LINEST appears to zero "small" coefficients, much like the
difference between =227.82-(227+0.75+0.05+0.02) and
=(227.82-(227+0.75+0.05+0.02)). Thankfully MS did not introduce this
"improvement" into the chart trendline.

- The nonzero coefficients from 2003 LINEST are correct for the full
problem, so you might be able (a new [untested] idea) to fit the terms
that gave zero coefficients to the residuals from the original 2003
LINEST fit, to restore the arbitrarily zeroed coefficients.

Jerry

Harlan Grove wrote:

Jerry W. Lewis wrote...
...

Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial
trendline (format to scientific notation with 14 decimal places); if
there is an appreciable difference between the graph and formula
coefficients, then the graph coefficients are better.

David Braden has posted code to directly extract coefficients from the
displayed chart trendline equation into cells

...

Meaning that Microsoft is still using different pieces of code for
LINEST and chart trend lines? And the older, trend line code is
better? If so, it begs the question why Microsoft squandered their
money changing the LINEST code rather than just using the trend
line code for LINEST.


  #10   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

"Jerry W. Lewis" wrote:

It has been a while since I did the comparisons, but my vague
recollections are that:

....
- The nonzero coefficients from 2003 LINEST are correct for the full
problem, so you might be able (a new [untested] idea) to fit the terms
that gave zero coefficients to the residuals from the original 2003
LINEST fit, to restore the arbitrarily zeroed coefficients.


Limited testing suggests that when there are coefficients that are exactly
zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently
calculated by subtraction at the center of data) are correct. If you
subtract off the fitted cubic and linear terms from the OP's y-data and then
fit a quadratic (with intercept) without a linear term, then you get the
correct missing coefficients.

Jerry



  #11   Report Post  
Harlan Grove
 
Posts: n/a
Default

Jerry W. Lewis wrote...
....
Limited testing suggests that when there are coefficients that are exactly
zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently
calculated by subtraction at the center of data) are correct. If you
subtract off the fitted cubic and linear terms from the OP's y-data and then
fit a quadratic (with intercept) without a linear term, then you get the
correct missing coefficients.


You've already acknowledged that Microsoft reinvented the wheel here.
It's almost funny that they seem to have come up with yet another
approach not as good as the one they use in chart trendlines. Or do you
mean that XL2003 LINEST is more accurate than chart trendlines for the
coefficients it happens to get right on the first pass?

  #12   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

My recollection is that the chart is slightly more accurate for very
ill-conditioned polynomials such as
http://groups.google.com/groups?selm...enetserver.com

I am not party to the code used by the chart polynomial trendline (if I
were I would probably be enjoined by NDA from discussing it). It is
possible that makes use of the fact that the model is a polynomial,
which would be inappropriate for the more general LINEST function. At
this point, I would be thrilled to get a LINEST that is not subject to
the cancellation problems inherent in forming the normal equations
(pre-2003) and does not mistakenly zero part of the solution.

Jerry

Harlan Grove wrote:

Jerry W. Lewis wrote...
...

Limited testing suggests that when there are coefficients that are exactly
zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently
calculated by subtraction at the center of data) are correct. If you
subtract off the fitted cubic and linear terms from the OP's y-data and then
fit a quadratic (with intercept) without a linear term, then you get the
correct missing coefficients.


You've already acknowledged that Microsoft reinvented the wheel here.
It's almost funny that they seem to have come up with yet another
approach not as good as the one they use in chart trendlines. Or do you
mean that XL2003 LINEST is more accurate than chart trendlines for the
coefficients it happens to get right on the first pass?


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
quattro converter for Excel 2003 cntaylor Excel Discussion (Misc queries) 4 April 30th 07 07:08 PM
Excel 2003 Opens Up Every File in My Documents Glenn Reschke Excel Discussion (Misc queries) 4 February 11th 05 06:00 PM
sharing/using/saving Excel 2002 files in Excel 2003 maze2009 Excel Discussion (Misc queries) 0 January 20th 05 07:27 PM
Excel 2003 back to Excel 2002 Pete Carr Excel Discussion (Misc queries) 3 December 27th 04 08:11 AM
Excel 2000 file when opened in Excel 2003 generates errors? Doug Excel Discussion (Misc queries) 13 December 25th 04 10:20 PM


All times are GMT +1. The time now is 10:06 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"