Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
David K
 
Posts: n/a
Default Rounding error in Stdev function result.

the function
=STDEV(1.4434,1.4434,1.4434)
gives
2.98023E-08
(at least on my computer using Excel 2002, sp3)

Is this just rounding error due to IEEE double precision. It seems pretty
large.

-David
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
JE McGimpsey
 
Posts: n/a
Default Rounding error in Stdev function result.

It is pretty large, but it's not simply due to rounding...

XL04:

=STDEV(1.4434,1.4434,1.4434) === 2.71947991102104E-16

(I don't have XL03 available right now, but I suspect that the result
would be the same).

The improvement was undoubtedly part of the overhaul of stats functions
for XL03/04:

http://support.microsoft.com/kb/828888/en-us


In article ,
"David K" <David wrote:

the function
=STDEV(1.4434,1.4434,1.4434)
gives
2.98023E-08
(at least on my computer using Excel 2002, sp3)

Is this just rounding error due to IEEE double precision. It seems pretty
large.

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
 
Posts: n/a
Default Rounding error in Stdev function result.

"David K" wrote:
the function =STDEV(1.4434,1.4434,1.4434) gives
2.98023E-08 (at least on my computer using Excel
2002, sp3)
Is this just rounding error due to IEEE double precision.


Yes. But the numerical error caused by the binary
representation is exacerbated by the variance formula
apparently used in Excel 2002 (sp3), which seems to
be different from Excel 2003.

Mathematically, the sample variance can be computed
using either the following equivalent formulas:

1. var = SUM((x - mean)^2, for all x) / (n - 1)

2. var = (SUM(x^2, for all x) - SUM(x, for all x)^2 / n) / (n - 1)

In both cases, std dev = SQRT(var).

When we compute var manually using #2, where
x1 = x2 = x3 = 1.4434 (n = 3), we get:

var = 8.88178419700125E-16
sd = 2.98023223876953E-08

That matches your results. But when we use #1,
we get:

var = 7.39557098644699E-32
sd = 2.71947991102104E-16

That matches the STDEV(1.4434,1.4434,1.4434)
result using Excel 2003.

Note that in both cases, VAR() and STDEV() are not
zero, as you might have expected, even though
AVERAGE(1.4434,1.4434,1.4434) displays 1.44340000000000E+00.
I believe the displayed result of AVERAGE() belies the
fact that there are non-zero bits in the remaining 2-3
binary bits that Excel must round in order to display a
decimal result. I believe that is evidenced by the fact
that (1.4434 - AVERAGE(...))^2 yields 7.39557098644699E-32.

(But I am mystified by the fact that this numerical error
is not evident when I program #1 in VBA, which does
display var = 0 -- a pleasant surprise. Without access
to the VBA binary representations, VBA compiled code
and internal Excel algorithms, I can only speculate wildly
about the disparity.)
  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
 
Posts: n/a
Default Rounding error in Stdev function result.

I wrote:
I am mystified by the fact that this numerical error
is not evident when I program #1 in VBA, which does
display var = 0 -- a pleasant surprise. Without access
to the VBA binary representations, VBA compiled code
and internal Excel algorithms, I can only speculate wildly
about the disparity.


My suspicion was confirmed by a response by Martin
Brown to my inquiry in excel.programming and by some
more experimentation on my part. Apparently, even the
retooled Excel 2003 STDEV() implementation does not
take full advantage of the floating point coprocessor, as
the VBA compiler seems to. I'm surprised.
  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Harlan Grove
 
Posts: n/a
Default Rounding error in Stdev function result.

wrote...
I wrote:
I am mystified by the fact that this numerical error
is not evident when I program #1 in VBA, which does
display var = 0 -- a pleasant surprise. Without access
to the VBA binary representations, VBA compiled code
and internal Excel algorithms, I can only speculate wildly
about the disparity.


My suspicion was confirmed by a response by Martin
Brown to my inquiry in excel.programming and by some
more experimentation on my part. Apparently, even the
retooled Excel 2003 STDEV() implementation does not
take full advantage of the floating point coprocessor, as
the VBA compiler seems to. I'm surprised.


It's not the STDEV function, per se, it's the mean. The mean of x, x
and x isn't exactly x. That is,

=SUMSQ(1.4434-AVERAGE(1.4434,1.4434,1.4434),
1.4434-AVERAGE(1.4434,1.4434,1.4434),
1.4434-AVERAGE(1.4434,1.4434,1.4434))

isn't zero. This even defeats DEVSQ, which is usually pretty good, but

=DEVSQ(1.4434,1.4434,1.4434)

returns 1.47911E-31 rather than zero. More to the point, while the
formula

=1.4434-AVERAGE(1.4434,1.4434,1.4434)

returns zero, the formula

=(1.4434-AVERAGE(1.4434,1.4434,1.4434))

returns 2.22045E-16.

Ain't Excel fun!

As with all floating point calculations, use explicit rounding for
final results. In this case,

=ROUND(STDEV(1.4434,1.4434,1.4434),5)

where the 5 is the maximum significant digits in STDEV's arguments.



  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
 
Posts: n/a
Default Rounding error in Stdev function result.

"Harlan Grove" wrote:
It's not the STDEV function, per se, it's the mean.


I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.

  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Harlan Grove
 
Posts: n/a
Default Rounding error in Stdev function result.

wrote...
....
I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.


80 bits wouldn't necessarily help in this case. If the mean is
calculated by summing all the values first then dividing by the count,
the sum itself could cause problems that dividing by the count wouldn't
necessarily solve. In an ideal world, cleverness would also be applied
to calculating the mean.

k = 0
n = 0
For Each v In d
n = n + 1
If n = 1 Then m = v Else m = m * (k / n) + v / n
k = n
Next v

Enter 1.4434 in A1:A3. Enter 1 in B1, 0 in C1 and =A1 in D1. Enter
=B1+1 in B2, =B1 in C2, =D1*(C2/B2)+A2/B2 in D2. Select B2:D2 and fill
down into B3:D3. Cell D3 contains the mean. Enter =AVERAGE(A1:A3) in
D4, and =(D3-D4) in D5. D5 evaluates to -2.22045E-16.

Enter =SUMSQ(A1-D4,A2-D4,A3-D4) in D6. It evaluates to 1.47911E-31. Now
enter =SUMSQ(A1-D3,A2-D3,A3-D3) in D7. It evaluates to 0.

Imagine that! Rounding error elimiated without using 80-bit FPU
registers!

Further, even calculating the mean as =(1.4434/3+1.4434/3+1.4434/3)
would eliminate the rounding error. Try
=SUMPRODUCT((A1:A3-SUMPRODUCT(A1:A3/3))^2).

The error is due *EXCLUSIVELY* to rounding error in the calculation of
the mean; however, it's possible to eliminate the rounding error
without using more bits of precision. For completeness, it's necessary
to understand that the order of operations is *ESSENTIAL* in floating
point arithmetic. While floating point addition and multiplication are
commutative, neither is associative, and the distributive law doesn't
hold (that is, all the time; there are fortuitous exceptions). In
floating point arithmetic it's usually best to perform multiplication
first, then addition if accuracy is more important than execution speed.

  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Dana DeLouis
 
Posts: n/a
Default Rounding error in Stdev function result.

... But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not.


I think Harlan is correct. For an alternative opinion, Math programs that
DO take advantage of the FP processor show that the average of:
(1.4434 + 1.4434 + 1.4434)/3
is:
1.4433999999999998

We just can not see the last two digits with Excel.
It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
differently. An argument could be made either way if this is good or bad.

StandardDeviation[{1.4434, 1.4434, 1.4434}]
Returns:
2.7194799110210365*^-16

Which is the same as Excel 2003.
It appears to me that an Excel 2003 worksheet is doing it correctly at the
math coprocessor level.
--
Dana DeLouis
Win XP & Office 2003

" wrote
in message ...
"Harlan Grove" wrote:
It's not the STDEV function, per se, it's the mean.


I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.



  #9   Report Post  
Posted to microsoft.public.excel.worksheet.functions
 
Posts: n/a
Default Rounding error in Stdev function result.

"Harlan Grove" wrote:
80 bits wouldn't necessarily help in this case.


I agree that 80 bits of precision instead of 52 bits does not
always lead to "exact" results. I did not intend to imply that.

But the fact is: it does in this case (viz., average of
1.4434 three times). Or at least, that was the explanation
give, and I accepted ti. See my posting in excel.programming
for details. This discussion should really be part of that
thread, not this one. See "Why do VBA and Excel floating
point results differ?".

The error is due *EXCLUSIVELY* to rounding error in
the calculation of the mean


Yes. And having (54%) more bits of precision will always
diminish those effects -- at least until the (intermediate)
result is stored into lower-precision variables.

For completeness, it's necessary to understand that the
order of operations is *ESSENTIAL* in floating point
arithmetic.


Which is why I initially said, in this thread, that anything I
would say about the explanation would be wild speculation
without my knowing the internal implementations of Excel
and VBA. I should have left well enough alone -- in this
thread.

If you have that kind of insight, I would appreciate it if you
would contribute to the thread in excel.programming.

  #10   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Jerry W. Lewis
 
Posts: n/a
Default Rounding error in Stdev function result.

XL2003 also returns 2.71947991102104E-16, but per
http://support.microsoft.com/kb/826349/
XL2003 was the first version that changed the STDEV calculation to

=SQRT(SUMSQ(1.4434-AVERAGE(1.4434,1.4434,1.4434),1.4434-AVERAGE(1.4434,1.4434,1.4434),1.4434-AVERAGE(1.4434,1.4434,1.4434))/2)

Prior to XL2003, STDEV was calculated as

=SQRT((SUMSQ(1.4434,1.4434,1.4434)-SUM(1.4434,1.4434,1.4434)^2/3)/2)

which returns the value that the OP reported.

These two formulas are mathematically but not numerically equivalent.

A third approach would use a one-pass updating algorithm

http://groups.google.com/group/micro...6ee0c636ad016a

which would correctly return zero for this standard deviation, even though
1.4434 cannot be exactly represented in binary.

Jerry

"JE McGimpsey" wrote:

It is pretty large, but it's not simply due to rounding...

XL04:

=STDEV(1.4434,1.4434,1.4434) === 2.71947991102104E-16

(I don't have XL03 available right now, but I suspect that the result
would be the same).

The improvement was undoubtedly part of the overhaul of stats functions
for XL03/04:

http://support.microsoft.com/kb/828888/en-us


In article ,
"David K" <David wrote:

the function
=STDEV(1.4434,1.4434,1.4434)
gives
2.98023E-08
(at least on my computer using Excel 2002, sp3)

Is this just rounding error due to IEEE double precision. It seems pretty
large.




  #11   Report Post  
Posted to microsoft.public.excel.worksheet.functions
 
Posts: n/a
Default Rounding error in Stdev function result.

I wrote:
having (54%) more bits of precision will always
diminish those effects -- at least until the (intermediate)
result is stored into lower-precision variables.


Well, that's not correct. First, the mantissa is only 23%
more bits of precision (64 bits). Second, I forget that the
original 52-bit mantissa is effectively simply zero-extended.
So the 80-bit FP result is not necessarily better than the
64-bit result. But I don't believe it will ever be worse.

  #12   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Jerry W. Lewis
 
Posts: n/a
Default Rounding error in Stdev function result.

80-bit reals have 64-bit mantissas vs. effectively 53-bit mantissas in 64-bit
reals, so there would be some improvement, but not as much as from using a
better algorithm. VBA offers very limited control over register arithmetic.
The following code will use register arithmetic within the Abs()

Function SD(x As Double) As Double
SD = Sqr(Abs((x * x + x * x + x * x - (x + x + x) * (x + x + x) / 3) / 2))
End Function

but SD(1.4434) returns 4.65661287307739E-10; better than XL2002's
2.98023223876953E-08, but nowhere near as good as XL2003's
2.71947991102104E-16 or the exact 0 that an updating algorithm would give

http://groups.google.com/group/micro...6ee0c636ad016a

Jerry

" wrote:

I wrote:
having (54%) more bits of precision will always
diminish those effects -- at least until the (intermediate)
result is stored into lower-precision variables.


Well, that's not correct. First, the mantissa is only 23%
more bits of precision (64 bits). Second, I forget that the
original 52-bit mantissa is effectively simply zero-extended.
So the 80-bit FP result is not necessarily better than the
64-bit result. But I don't believe it will ever be worse.

  #13   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Jerry W. Lewis
 
Posts: n/a
Default Rounding error in Stdev function result.

1.4433999999999998 is the result of calculating the average using 64-bit
reals. You can display additional figures as
=D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
http://groups.google.com/group/micro...06871cf92f8465

64-bit reals can exactly represent all 15 digit integers, but require 17
digits to uniquely characterize a floating point number.

80-bit reals can exactly represent all 19 digit integers

The 80-byte representation of 1.4434 and of the average of three of them are
both equal to 1.44339999999999999997E0 (21 digits required to distinguish
from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
calculations.

In this case, you could even use intermediate rounding to 64-bit reals and
still get 0 with a partial use of register precision, as the following VBA
code shows

Function DSq(Optional x As Double = 1.4434) As Double
Dim ave As Double
' each line is calculated in 80-bit register, then stored in 64-bit variable
ave = (x + x + x) / 3
DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
ave)
End Function

Jerry

"Dana DeLouis" wrote:

... But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not.


I think Harlan is correct. For an alternative opinion, Math programs that
DO take advantage of the FP processor show that the average of:
(1.4434 + 1.4434 + 1.4434)/3
is:
1.4433999999999998

We just can not see the last two digits with Excel.
It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
differently. An argument could be made either way if this is good or bad.

StandardDeviation[{1.4434, 1.4434, 1.4434}]
Returns:
2.7194799110210365*^-16

Which is the same as Excel 2003.
It appears to me that an Excel 2003 worksheet is doing it correctly at the
math coprocessor level.
--
Dana DeLouis
Win XP & Office 2003

" wrote
in message ...
"Harlan Grove" wrote:
It's not the STDEV function, per se, it's the mean.


I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.




  #14   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Dana DeLouis
 
Posts: n/a
Default Rounding error in Stdev function result.

Hi. Thanks Jerry. As a side note to the op, if you just need the STDEV of
3 cells, it "appears" that an equation will returns zero (0) more often if
all 3 cells are equal.
Perhaps use this:
=SQRT((A1^2+B1^2-B1*C1+C1^2-A1*(B1+C1))/3)
instead of this:
=STDEV(A1:C1)

--
Dana DeLouis
Win XP & Office 2003


"Jerry W. Lewis" wrote in message
...
1.4433999999999998 is the result of calculating the average using 64-bit
reals. You can display additional figures as
=D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
http://groups.google.com/group/micro...06871cf92f8465

64-bit reals can exactly represent all 15 digit integers, but require 17
digits to uniquely characterize a floating point number.

80-bit reals can exactly represent all 19 digit integers

The 80-byte representation of 1.4434 and of the average of three of them
are
both equal to 1.44339999999999999997E0 (21 digits required to distinguish
from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
calculations.

In this case, you could even use intermediate rounding to 64-bit reals and
still get 0 with a partial use of register precision, as the following VBA
code shows

Function DSq(Optional x As Double = 1.4434) As Double
Dim ave As Double
' each line is calculated in 80-bit register, then stored in 64-bit
variable
ave = (x + x + x) / 3
DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
ave)
End Function

Jerry

"Dana DeLouis" wrote:

... But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not.


I think Harlan is correct. For an alternative opinion, Math programs
that
DO take advantage of the FP processor show that the average of:
(1.4434 + 1.4434 + 1.4434)/3
is:
1.4433999999999998

We just can not see the last two digits with Excel.
It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
differently. An argument could be made either way if this is good or
bad.

StandardDeviation[{1.4434, 1.4434, 1.4434}]
Returns:
2.7194799110210365*^-16

Which is the same as Excel 2003.
It appears to me that an Excel 2003 worksheet is doing it correctly at
the
math coprocessor level.
--
Dana DeLouis
Win XP & Office 2003

"
wrote
in message ...
"Harlan Grove" wrote:
It's not the STDEV function, per se, it's the mean.

I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.






  #16   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Jerry W. Lewis
 
Posts: n/a
Default Rounding error in Stdev function result.

Your alternat formula does seem to be a little better numerically than
pre-2003 STDEV; but it is still squaring terms before subtracting, so it will
be nowhere near as numerically stable as =SQRT(DEVSQ(A1:C1)/2), which is used
by xl2003.

Jerry

"Dana DeLouis" wrote:

Hi. Thanks Jerry. As a side note to the op, if you just need the STDEV of
3 cells, it "appears" that an equation will returns zero (0) more often if
all 3 cells are equal.
Perhaps use this:
=SQRT((A1^2+B1^2-B1*C1+C1^2-A1*(B1+C1))/3)
instead of this:
=STDEV(A1:C1)

--
Dana DeLouis
Win XP & Office 2003


"Jerry W. Lewis" wrote in message
...
1.4433999999999998 is the result of calculating the average using 64-bit
reals. You can display additional figures as
=D2D(AVERAGE(1.4434,1.4434,1.4434)) using my D2D function from
http://groups.google.com/group/micro...06871cf92f8465

64-bit reals can exactly represent all 15 digit integers, but require 17
digits to uniquely characterize a floating point number.

80-bit reals can exactly represent all 19 digit integers

The 80-byte representation of 1.4434 and of the average of three of them
are
both equal to 1.44339999999999999997E0 (21 digits required to distinguish
from 1.4434), so the XL2003 algorithm would return 0 if it used 80-byte
calculations.

In this case, you could even use intermediate rounding to 64-bit reals and
still get 0 with a partial use of register precision, as the following VBA
code shows

Function DSq(Optional x As Double = 1.4434) As Double
Dim ave As Double
' each line is calculated in 80-bit register, then stored in 64-bit
variable
ave = (x + x + x) / 3
DSq = (x - ave) * (x - ave) + (x - ave) * (x - ave) + (x - ave) * (x -
ave)
End Function

Jerry

"Dana DeLouis" wrote:

... But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not.

I think Harlan is correct. For an alternative opinion, Math programs
that
DO take advantage of the FP processor show that the average of:
(1.4434 + 1.4434 + 1.4434)/3
is:
1.4433999999999998

We just can not see the last two digits with Excel.
It shows that an Excel 2003 Worksheet is doing it correctly. Vba rounded
differently. An argument could be made either way if this is good or
bad.

StandardDeviation[{1.4434, 1.4434, 1.4434}]
Returns:
2.7194799110210365*^-16

Which is the same as Excel 2003.
It appears to me that an Excel 2003 worksheet is doing it correctly at
the
math coprocessor level.
--
Dana DeLouis
Win XP & Office 2003

"
wrote
in message ...
"Harlan Grove" wrote:
It's not the STDEV function, per se, it's the mean.

I understand. But the implementation of STDEV() could
easily compute the mean internally, taking advantage of
the 80-bit FP registers. It does not. Nor does AVERAGE(),
as you point out. But the OP asked about STDEV(), not
AVERAGE(). I tailored my comments to the OP's context.







Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Rounding up the result of an IF function donnaK Excel Worksheet Functions 3 December 8th 05 10:15 PM
RSQ function gives incorrect result imurch Excel Worksheet Functions 1 November 4th 05 01:35 PM
convert numbers to text bellman Excel Discussion (Misc queries) 0 October 4th 05 10:28 PM
excel melissa Excel Worksheet Functions 2 August 19th 05 09:46 PM
Amount or Numbers in Words ron New Users to Excel 6 December 24th 04 07:32 PM


All times are GMT +1. The time now is 12:06 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"