Home |
Search |
Today's Posts |
#1
|
|||
|
|||
LINEST bug in Excel 2003
I have used LINEST to provide polynomial coefficients in Excel97 through
Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#2
|
|||
|
|||
Please check out Jerry W. Lewis' posting of 5/6/2005
-- Gary's Student "mathman" wrote: I have used LINEST to provide polynomial coefficients in Excel97 through Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#3
|
|||
|
|||
mathman -
I think you should also be concerned about the "insidious problem" you may encounter from overfitting the data when using a sixth-order polynomial. - Mike www.mikemiddleton.com "mathman" wrote in message ... I have used LINEST to provide polynomial coefficients in Excel97 through Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#4
|
|||
|
|||
Thanks for trying to help, I can only find one post from Jerry on 5/6/2005
under the thread "simple statistical analysis" this did not seem relevant but perhaps I missed the point. "Gary's Student" wrote: Please check out Jerry W. Lewis' posting of 5/6/2005 -- Gary's Student "mathman" wrote: I have used LINEST to provide polynomial coefficients in Excel97 through Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#5
|
|||
|
|||
Thanks for your interest - the real data has many more points the ranges were
just an example. The problem of missing first coeff applies to 3rd order 4th order 5th order etc. I have tried multiple linear fitting and that fails in the same way in Excel 2003 but works perfectly in Excel 97, Excel 2000 & Excel 2002 "Mike Middleton" wrote: mathman - I think you should also be concerned about the "insidious problem" you may encounter from overfitting the data when using a sixth-order polynomial. - Mike www.mikemiddleton.com "mathman" wrote in message ... I have used LINEST to provide polynomial coefficients in Excel97 through Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#6
|
|||
|
|||
"mathman" wrote...
Thanks for trying to help, I can only find one post from Jerry on 5/6/2005 under the thread "simple statistical analysis" this did not seem relevant but perhaps I missed the point. .... See http://groups-beta.google.com/group/...e=source&hl=en (or http://makeashorterlink.com/?A19821A3B ). |
#7
|
|||
|
|||
Select 7 columns in a row and array enter (Ctrl-Shift-Enter) the
following formula to reproduce the pre 2003 calculation, except that the coefficients will be given in reverse order (intercept, linear, ... 6th power). =TRANSPOSE(MMULT(MINVERSE(MMULT(TRANSPOSE(xCol^{0, 1,2,3,4,5,6}), xCol^{0,1,2,3,4,5,6})), MMULT(TRANSPOSE(xCol^{0,1,2,3,4,5,6}), yCol))) Note the explicit zero power in this formula. In addition to the potential for overfitting the data, I would be concerned about numerical stability of the calculation (the reason that Excel 2003 changed its approach). http://groups-beta.google.com/group/...916351b5101102 Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial trendline (format to scientific notation with 14 decimal places); if there is an appreciable difference between the graph and formula coefficients, then the graph coefficients are better. David Braden has posted code to directly extract coefficients from the displayed chart trendline equation into cells http://groups.google.com/groups?selm....microsoft.com Jerry mathman wrote: Thanks for your interest - the real data has many more points the ranges were just an example. The problem of missing first coeff applies to 3rd order 4th order 5th order etc. I have tried multiple linear fitting and that fails in the same way in Excel 2003 but works perfectly in Excel 97, Excel 2000 & Excel 2002 "Mike Middleton" wrote: mathman - I think you should also be concerned about the "insidious problem" you may encounter from overfitting the data when using a sixth-order polynomial. - Mike www.mikemiddleton.com "mathman" wrote in message ... I have used LINEST to provide polynomial coefficients in Excel97 through Excel 2002 with no problems. My company has loaded Excel 2003 and now all the existing spreadsheets give the wrong answers but no error messages. I have read the KB articles on LINEST in 2003 and none give a clue to the problem. I use two columns of data for x and y with in this example a row of 7 cells highlighted with this array formula: =LINEST(AN3:AN11,AM3:AM11^{1,2,3,4,5,6}) Excel 97 -2002 dutifully reports 7 cells with the coeffs of X6,X5,X4,X3,X2,X1 & X0 Excel 2003 sets X^6 to zero regardless of the data set and reports a wildly wrong value for the constant. I have tested this on a clean build of Excel 2003 with the Analysis Tool pack loaded as well as several of our 8000 corporate desktops. I cannot find any reference to this insidious problem on the internet, has anyone else experienced it? |
#8
|
|||
|
|||
Jerry W. Lewis wrote...
.... Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial trendline (format to scientific notation with 14 decimal places); if there is an appreciable difference between the graph and formula coefficients, then the graph coefficients are better. David Braden has posted code to directly extract coefficients from the displayed chart trendline equation into cells .... Meaning that Microsoft is still using different pieces of code for LINEST and chart trend lines? And the older, trend line code is better? If so, it begs the question why Microsoft squandered their money changing the LINEST code rather than just using the trend line code for LINEST. |
#9
|
|||
|
|||
You are correct, LINEST and chart trendline are not doing exactly the
same thing. I don't know why MS reinvented the wheel here, but it is not the first example of redundant Excel functions which are programmed differently. It has been a while since I did the comparisons, but my vague recollections are that: - Chart is slightly better numerically than 2003 LINEST (when no coefficient estimates are exactly zero) - 2003 LINEST (when no coefficient estimates are exactly zero) is comparable to lm() in S-PLUS and R, and thus is far superior to previous versions of LINEST (except for the anomaly under discussion). - 2003 LINEST appears to zero "small" coefficients, much like the difference between =227.82-(227+0.75+0.05+0.02) and =(227.82-(227+0.75+0.05+0.02)). Thankfully MS did not introduce this "improvement" into the chart trendline. - The nonzero coefficients from 2003 LINEST are correct for the full problem, so you might be able (a new [untested] idea) to fit the terms that gave zero coefficients to the residuals from the original 2003 LINEST fit, to restore the arbitrarily zeroed coefficients. Jerry Harlan Grove wrote: Jerry W. Lewis wrote... ... Plot the data as an "XY (Scatter)" chart and add a 6th degree polynomial trendline (format to scientific notation with 14 decimal places); if there is an appreciable difference between the graph and formula coefficients, then the graph coefficients are better. David Braden has posted code to directly extract coefficients from the displayed chart trendline equation into cells ... Meaning that Microsoft is still using different pieces of code for LINEST and chart trend lines? And the older, trend line code is better? If so, it begs the question why Microsoft squandered their money changing the LINEST code rather than just using the trend line code for LINEST. |
#10
|
|||
|
|||
"Jerry W. Lewis" wrote:
It has been a while since I did the comparisons, but my vague recollections are that: .... - The nonzero coefficients from 2003 LINEST are correct for the full problem, so you might be able (a new [untested] idea) to fit the terms that gave zero coefficients to the residuals from the original 2003 LINEST fit, to restore the arbitrarily zeroed coefficients. Limited testing suggests that when there are coefficients that are exactly zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently calculated by subtraction at the center of data) are correct. If you subtract off the fitted cubic and linear terms from the OP's y-data and then fit a quadratic (with intercept) without a linear term, then you get the correct missing coefficients. Jerry |
#11
|
|||
|
|||
Jerry W. Lewis wrote...
.... Limited testing suggests that when there are coefficients that are exactly zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently calculated by subtraction at the center of data) are correct. If you subtract off the fitted cubic and linear terms from the OP's y-data and then fit a quadratic (with intercept) without a linear term, then you get the correct missing coefficients. You've already acknowledged that Microsoft reinvented the wheel here. It's almost funny that they seem to have come up with yet another approach not as good as the one they use in chart trendlines. Or do you mean that XL2003 LINEST is more accurate than chart trendlines for the coefficients it happens to get right on the first pass? |
#12
|
|||
|
|||
My recollection is that the chart is slightly more accurate for very
ill-conditioned polynomials such as http://groups.google.com/groups?selm...enetserver.com I am not party to the code used by the chart polynomial trendline (if I were I would probably be enjoined by NDA from discussing it). It is possible that makes use of the fact that the model is a polynomial, which would be inappropriate for the more general LINEST function. At this point, I would be thrilled to get a LINEST that is not subject to the cancellation problems inherent in forming the normal equations (pre-2003) and does not mistakenly zero part of the solution. Jerry Harlan Grove wrote: Jerry W. Lewis wrote... ... Limited testing suggests that when there are coefficients that are exactly zero, coefficients OTHER THAN THE ASSUMED INTERCEPT TERM (apparently calculated by subtraction at the center of data) are correct. If you subtract off the fitted cubic and linear terms from the OP's y-data and then fit a quadratic (with intercept) without a linear term, then you get the correct missing coefficients. You've already acknowledged that Microsoft reinvented the wheel here. It's almost funny that they seem to have come up with yet another approach not as good as the one they use in chart trendlines. Or do you mean that XL2003 LINEST is more accurate than chart trendlines for the coefficients it happens to get right on the first pass? |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
quattro converter for Excel 2003 | Excel Discussion (Misc queries) | |||
Excel 2003 Opens Up Every File in My Documents | Excel Discussion (Misc queries) | |||
sharing/using/saving Excel 2002 files in Excel 2003 | Excel Discussion (Misc queries) | |||
Excel 2003 back to Excel 2002 | Excel Discussion (Misc queries) | |||
Excel 2000 file when opened in Excel 2003 generates errors? | Excel Discussion (Misc queries) |