Home |
Search |
Today's Posts |
#1
|
|||
|
|||
Correlation Coefficient Issue
I have a formula in cell C7, the formula is:
CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane |
#2
|
|||
|
|||
See Help for CORREL worksheet function. The denominator is the product
of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane |
#3
|
|||
|
|||
If I increase B4:B6 to show 14 decimal places, the values
will only be 1.90000000000000 - Why would that make a difference? -----Original Message----- See Help for CORREL worksheet function. The denominator is the product of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane . |
#4
|
|||
|
|||
I presume then that B4:B6 are calculated results. Because of decimal/
binary conversions, etc. it is possible to have slight differences between calculated results that cannot be displayed. If you will post (text only, please, no attachments) the formulas and the contents of the referenced cells, I will endeavor to explain why these slight differences are not arithmetic errors. Meanwhile, you will find that at least one of the following formulae are not zero =(B4-B5) =(B4-B6) =(B5-B6) The parentheses in these formulas are required to see the differences when the values are equal to 15 decimal places. In cases such as this, rounding the results in B4:B6 is the only way to get the expected result. Jerry Diane wrote: If I increase B4:B6 to show 14 decimal places, the values will only be 1.90000000000000 - Why would that make a difference? -----Original Message----- See Help for CORREL worksheet function. The denominator is the product of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane . |
#5
|
|||
|
|||
As a possibly useful preview, enter 1.9 into A15
In A14 enter the formula =A15+2^-51 In A15 enter the formula =A15-2^-51 Copy cell A14 and paste over A1:A13 Copy cell A16 and paste over A17:A30 Format A1:A16 as numeric with 15 decimal places This illustrates that (as documented) Excel will only display 15 digits (14 decimal places with these numbers), yet there are 23 distinct binary numbers that all display as 1.90000000000000 In your case your cell formulas resulted in at least two of these 23 distinct values, hence the resulting nonzero correlation. Jerry Jerry W. Lewis wrote: I presume then that B4:B6 are calculated results. Because of decimal/ binary conversions, etc. it is possible to have slight differences between calculated results that cannot be displayed. If you will post (text only, please, no attachments) the formulas and the contents of the referenced cells, I will endeavor to explain why these slight differences are not arithmetic errors. Meanwhile, you will find that at least one of the following formulae are not zero =(B4-B5) =(B4-B6) =(B5-B6) The parentheses in these formulas are required to see the differences when the values are equal to 15 decimal places. In cases such as this, rounding the results in B4:B6 is the only way to get the expected result. Jerry Diane wrote: If I increase B4:B6 to show 14 decimal places, the values will only be 1.90000000000000 - Why would that make a difference? -----Original Message----- See Help for CORREL worksheet function. The denominator is the product of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane |
#6
|
|||
|
|||
Thanks for your help - Below are the values:
The following are manual entries: B4: 1.90 B5: 1.90 B6: 1.90 The following are manual entries: C4: 2.13 C5: 1.83 C6: 3.02 Formula in C7: =CORREL(B4:B6,C4:C6) The above returns 0.000 The following are manual entries: B12: 9.50 B13: 9.50 B14: 9.50 The following are manual entries: C12: 9.56 C13: 10.26 C14: 10.16 Formula in C15: =CORREL(B12:B14,C12:C14) The above returns #DIV/0! Thank you again for your help -----Original Message----- I presume then that B4:B6 are calculated results. Because of decimal/ binary conversions, etc. it is possible to have slight differences between calculated results that cannot be displayed. If you will post (text only, please, no attachments) the formulas and the contents of the referenced cells, I will endeavor to explain why these slight differences are not arithmetic errors. Meanwhile, you will find that at least one of the following formulae are not zero =(B4-B5) =(B4-B6) =(B5-B6) The parentheses in these formulas are required to see the differences when the values are equal to 15 decimal places. In cases such as this, rounding the results in B4:B6 is the only way to get the expected result. Jerry Diane wrote: If I increase B4:B6 to show 14 decimal places, the values will only be 1.90000000000000 - Why would that make a difference? -----Original Message----- See Help for CORREL worksheet function. The denominator is the product of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane . . |
#7
|
|||
|
|||
Diane wrote...
Thanks for your help - Below are the values: The following are manual entries: B4: 1.90 B5: 1.90 B6: 1.90 The following are manual entries: C4: 2.13 C5: 1.83 C6: 3.02 Formula in C7: =CORREL(B4:B6,C4:C6) The above returns 0.000 ... This is an error. FWIW, STDEV(B4:B6) returns 2.98023E-08 in XL97SR-2, but that's wrong because the standard deviation of a set of any number of instances of the same number should be zero. This appears to be nothing more than one of the nastier instances of Excel's poor numerical implementation in older versions. However, in all instances, if B4:B6 has zero variance, so zero standard deviation, then it's correlation with any other set of values is undefined. Putting this another way, there's no such thing as correlation between constants and random variables or between different constants. --------- www.coffeecozy.com Use your Bodum and give up cold coffee for good! |
#8
|
|||
|
|||
Per Help for CORREL and DEVSQ,
=CORREL(B4:B6,C4:C6) is equivalent to =COVAR(B4:B6,C4:C6) / ( SQRT(DEVSQ(B4:B6)/COUNT(B4:B6)) * SQRT(DEVSQ(C4:C6)/COUNT(C4:C6)) ) =DEVSQ(B12:B14) is zero but =DEVSQ(B4:B6) is not zero, hence the difference in CORREL results. =DEVSQ(B4:B6) is not zero because =(B4-AVERAGE(B4:B6)) is not zero. That can happen when all of the values are identical, because computer math is done to finite precision. Imagine that you had a decimal computer that carried only 4 figures in calculations. In that case, SUM(0.4444,0.4444,0.4444) would be 1.333 since 1.3332 would require a 5th figure. Thus AVERAGE(0.4444,0.4444,0.4444) would be 0.4443 instead of 0.4444 and hence (0.4444-AVERAGE(0.4444,0.4444,0.4444)) would not be zero. You are probably wondering how this is applicable, when your repeated number is 1.9 since Excel advertises 15 digit precision. The reason is that Excel (and almost all other general purpose hardware and software) does binary math, not decimal math. Most decimal fractions (including 1.9) is a non-terminating binary fraction, just as 1/3 is a non-terminating decimal fraction. The trailing bit pattern in the binary approximation to 1.9 is such that there is lost information when you add 3 copies of 1.9 (crossing 2 powers of 2) just as in my hypothetical example. If you chase through the details of the IEEE 754 standard, you will find that =(B4-AVERAGE(B4:B6)) should equal 1/4503599627370496 which is 2.220446049250313080847263336181640625E-16. Excel displays that difference as 2.22044604925031E-16, indicating that it is performing the arithmetic correctly. 9.5 is exactly representable in binary (an integer plus a power of 2), hence finite precision arithmetic matches the infinite precision arithmetic that you would mentally do. Jerry Diane wrote: Thanks for your help - Below are the values: The following are manual entries: B4: 1.90 B5: 1.90 B6: 1.90 The following are manual entries: C4: 2.13 C5: 1.83 C6: 3.02 Formula in C7: =CORREL(B4:B6,C4:C6) The above returns 0.000 The following are manual entries: B12: 9.50 B13: 9.50 B14: 9.50 The following are manual entries: C12: 9.56 C13: 10.26 C14: 10.16 Formula in C15: =CORREL(B12:B14,C12:C14) The above returns #DIV/0! Thank you again for your help -----Original Message----- I presume then that B4:B6 are calculated results. Because of decimal/ binary conversions, etc. it is possible to have slight differences between calculated results that cannot be displayed. If you will post (text only, please, no attachments) the formulas and the contents of the referenced cells, I will endeavor to explain why these slight differences are not arithmetic errors. Meanwhile, you will find that at least one of the following formulae are not zero =(B4-B5) =(B4-B6) =(B5-B6) The parentheses in these formulas are required to see the differences when the values are equal to 15 decimal places. In cases such as this, rounding the results in B4:B6 is the only way to get the expected result. Jerry Diane wrote: If I increase B4:B6 to show 14 decimal places, the values will only be 1.90000000000000 - Why would that make a difference? -----Original Message----- See Help for CORREL worksheet function. The denominator is the product of standard deviations for x and y. The standard deviation of B12:B14 is zero ... Format B4:B6 to show 14 decimal places. You will find that they are not all equal, hence the standard deviation is small, but not zero. Jerry Diane wrote: I have a formula in cell C7, the formula is: CORREL(B4:B6,C4:C6) The cells have the follwing values: B4: 1.90 B5: 1.90 B6: 1.90 C4: 2.13 C5: 1.83 C6: 3.02 Cell C7 returns: 0.000 Here's where the problem comes in. The formula in cell C15 is: CORREL(B12:B14,C12:C14) The cells have the follwing values: B12: 9.50 B13: 9.50 B14: 9.50 C12: 9.56 C13: 10.26 C13: 10.16 Cell C15 returns the error: #DIV/0! Why is the formula producing the #DIV/0! error in the second set and not the first or vice versa? ANY help would be greatly appreciated. Thanks, Diane . . |
#9
|
|||
|
|||
CORREL() calculates like DEVSQ(), not like pre-2003 STDEV(). As noted
in another post in this thread, =(B4-AVERAGE(B4:B6)) is not zero. That is why I prefer updating algorithms to two pass alogorithms, http://groups.google.com/groups?selm...0no_e-mail.com which would have no truncation error in either case. Jerry hrlngrv - ExcelForums.com wrote: Diane wrote... Thanks for your help - Below are the values: The following are manual entries: B4: 1.90 B5: 1.90 B6: 1.90 The following are manual entries: C4: 2.13 C5: 1.83 C6: 3.02 Formula in C7: =CORREL(B4:B6,C4:C6) The above returns 0.000 ... This is an error. FWIW, STDEV(B4:B6) returns 2.98023E-08 in XL97SR-2, but that's wrong because the standard deviation of a set of any number of instances of the same number should be zero. This appears to be nothing more than one of the nastier instances of Excel's poor numerical implementation in older versions. However, in all instances, if B4:B6 has zero variance, so zero standard deviation, then it's correlation with any other set of values is undefined. Putting this another way, there's no such thing as correlation between constants and random variables or between different constants. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Shared File & Freeze panes issue | Excel Discussion (Misc queries) | |||
please help with this issue | New Users to Excel | |||
Issue with copying Text. | Excel Discussion (Misc queries) | |||
VALUE issue | Excel Worksheet Functions | |||
#VALUE! issue | Excel Worksheet Functions |