Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
Hello;
The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
First, I'd change the name of SUM to sumthing else. Not saying that using
the name of a Worksheet function is affecting results, but why take a chance? Maybe use myTotal or myInterimResult See what your current SUM value is before the multiplication in Test2() - put a breakpoint at the RESULT2=SUM*1000 statement or a STOP command just ahead of it. Use the immediate window to Print SUM and see whether or not that value is the same as calculated in TEST1 (capturing value of SUM at same point in that code via same method). Since you indicate more code follows this, want to make sure that nothing is affecting the contents of RESULT1 or RESULT2 later on. "monir" wrote: Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
Yes, you have to understand your data and design your algorithm to give you
the best answer. -- Regards, Tom Ogilvy "monir" wrote in message ... Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
Hello;
Here's more relevant info. that might clarify the problem: SUB TEST3() ' ... my code... mySum1 = 0.0 mySum2 = 0.0 mySum3 = 0.0 FOR I = 1 TO 30000 mySum1 = mySum1 + A(I)*B(I)/C(I) mySum2 = mySum2 + A(I)*B(I)/C(I) /1000. mySum3 = mySum3 + A(I)*B(I)/C(I) *1000. NEXT Result1 = mySum1 Result2 = mySum2 *1000. Result3 = mySum3 /1000. ' ...my code... END SUB Variables: Result1, Result2, Result3, mySum1, mySum2, mySum3, A,B,C are DOUBLE data type. Values in A,B,C vary from very small to very large, and could be +ve or -ve, in no specific order. For a typical set of analytical data, the above sample code produces: ....Result1 = 0.9986 ........Result2 = 4.2432 ............Result3 = 2.5388 The specified accuracies in computing the arrays' elements A,B,C are 1.D-8, 5.D-8, 1.D-8 respectively, which are quite reasonable realizing the complexity of the computational methods. Increasing the specified accuracies would adversely affect the stability of the numerical processes and their conditioning. The FOR loop in the above sample code involves multiplication, division, and addition of a wide range of a large number of small and large, +ve and -ve real numbers. How the processor handles each two operands at each stage, individually and in combinations, and how the intermediate results are represented and stored (in memory), appear to be at the root of the problem here. (These specialized topics are beyond my understanding of computing!!). Sure enough, some real numbers of "nearly" same value but opposite sign could possibly be encountered in A*B/C, but (I think) should not influence the resulting value in mySum = mySum+A*B/C, provided no unnecessary rounding takes place. The situation is considerably different from an ill-posed or ill-conditioned problem That said, let me re-phrase the question: Given one-dimensional arrays A, B, C, each of size N (=20000 to 30000), and values of arrays' elements vary between, say, +/- 1.E-5 and +/- 1.E+5 (i.e.; the value of A*B/C could potentially vary between -1.E-15 and +1.E+15), the question is: WHAT IS THE BEST WAY TO PERFORM THE FOLLOWING ARITHMETIC OPERATION WITH MINIMUM ROUND-OFF ERROR ??? Code: mySum = 0.0 FOR I = 1 TO 20000 mySum = mySum + A(I)*B(I)/C(I) NEXT Result = mySum Any suggestion(s) ?? Best regards. "Tom Ogilvy" wrote: Yes, you have to understand your data and design your algorithm to give you the best answer. -- Regards, Tom Ogilvy "monir" wrote in message ... Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
"monir" wrote: Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
"monir" wrote:
I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Sort by absolute value and sum from smallest to largest. Jerry |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
You are probably on the right track in thinking that the problem is due to
inaccuracies of trying to do math involving very large and very small numbers in a digital world. Let's look at the original formulas you gave that caused such a huge variation in results: 1st case you used SUM = SUM + A(I)*B(I)/C(I) and came up with a result of .9986 in the second formula you use SUM = SUM + A(I)*B(I)/C(I)/1000 and once out of the loop you then revised the SUM by RESULT2 = SUM * 1000 which gave you 4.2432 In the second case, you've not considered the whole issue of precedence of calculations. Try this instead and see if the results aren't closer for the second case: Dim mySUM as Double mySUM = 0 For I = 1 to 20000 mySUM = (mySUM + A(I)*B(I)/C(I))/1000 Next Result2 = mySUM * 1000 The difference is that in your original 2nd example, all of the multiplications and division gets carried out BEFORE the previous content of mySUM is added to the result, including that last /1000. This changes the whole thing. Don't know why I didn't catch that before. Then afterwards you multiply the whole thing by 1000 but the whole thing was never divided by 1000 within the loop. Examples: Lets say that at the beginning of the 2nd formula, mySUM has a value of 1 and that A(I) = 4, B(I) = 3 and C(I) = 2 then mySUM = mySUM + A(I)*B(I)/C(I)/1000 has a result of 1.006 but mySUM = (mySUM + A(I)*B(I)/C(I)) /1000 has a result of .007 (kind of an ultra-mini-James Bond result) Both answers are equally correct - but which one is correct for your circumstances depends on what you really meant to do. In the first example (1.006 result) the math is actually handled like this: calculate the value of A(I)*B(I)/C(I)/1000 and THEN add the value of mySUM to that result. In the second example the math is done like this: calculate the value of A(I)*B(I)/C(I) and then add the value of mySum to that result, and THEN divide that result by 1000 Use parenthesis pairs to control what numbers are acted upon in what sequence. That is going to be determined by the original formula or algorithm that defines how the numbers should be processed. For more help, look in Excel Help and search for "about calculation operators" - then look in the returned list of topics for one with that same phrase (shows up 2nd in the list that Excel 2003 returned for me). It explains it all in more detail with regards to hierarchy/precendence of operators and how things get done. "monir" wrote: Hello; Here's more relevant info. that might clarify the problem: SUB TEST3() ' ... my code... mySum1 = 0.0 mySum2 = 0.0 mySum3 = 0.0 FOR I = 1 TO 30000 mySum1 = mySum1 + A(I)*B(I)/C(I) mySum2 = mySum2 + A(I)*B(I)/C(I) /1000. mySum3 = mySum3 + A(I)*B(I)/C(I) *1000. NEXT Result1 = mySum1 Result2 = mySum2 *1000. Result3 = mySum3 /1000. ' ...my code... END SUB Variables: Result1, Result2, Result3, mySum1, mySum2, mySum3, A,B,C are DOUBLE data type. Values in A,B,C vary from very small to very large, and could be +ve or -ve, in no specific order. For a typical set of analytical data, the above sample code produces: ...Result1 = 0.9986 .......Result2 = 4.2432 ...........Result3 = 2.5388 The specified accuracies in computing the arrays' elements A,B,C are 1.D-8, 5.D-8, 1.D-8 respectively, which are quite reasonable realizing the complexity of the computational methods. Increasing the specified accuracies would adversely affect the stability of the numerical processes and their conditioning. The FOR loop in the above sample code involves multiplication, division, and addition of a wide range of a large number of small and large, +ve and -ve real numbers. How the processor handles each two operands at each stage, individually and in combinations, and how the intermediate results are represented and stored (in memory), appear to be at the root of the problem here. (These specialized topics are beyond my understanding of computing!!). Sure enough, some real numbers of "nearly" same value but opposite sign could possibly be encountered in A*B/C, but (I think) should not influence the resulting value in mySum = mySum+A*B/C, provided no unnecessary rounding takes place. The situation is considerably different from an ill-posed or ill-conditioned problem That said, let me re-phrase the question: Given one-dimensional arrays A, B, C, each of size N (=20000 to 30000), and values of arrays' elements vary between, say, +/- 1.E-5 and +/- 1.E+5 (i.e.; the value of A*B/C could potentially vary between -1.E-15 and +1.E+15), the question is: WHAT IS THE BEST WAY TO PERFORM THE FOLLOWING ARITHMETIC OPERATION WITH MINIMUM ROUND-OFF ERROR ??? Code: mySum = 0.0 FOR I = 1 TO 20000 mySum = mySum + A(I)*B(I)/C(I) NEXT Result = mySum Any suggestion(s) ?? Best regards. "Tom Ogilvy" wrote: Yes, you have to understand your data and design your algorithm to give you the best answer. -- Regards, Tom Ogilvy "monir" wrote in message ... Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
Hope this will convince you that you have misread the intent. Successively
diving the subtotal by 1000, 20000 times isn't going to be reversed by multiplying it by 1000 one time Sub abc() Dim mySUM As Double A = 1 B = 1 C = 1 mySUM = 0 For I = 1 To 20000 mySUM = mySUM + A * B / C Next result1 = mySUM ' your suggested fix: mySUM = 0 For I = 1 To 20000 mySUM = (mySUM + A * B / C) / 1000 Next result2 = mySUM * 1000 Debug.Print result1, result2 End Sub This results in: 20000 1.001001001001 perhaps you meant: mySUM = mySUM + (A * B / C) / 1000 but that does ignore the precedence rules. Mathematically, there was nothing wrong with the original. -- Regards, Tom Ogilvy "JLatham" <HelpFrom @ Jlathamsite.com.(removethis) wrote in message ... You are probably on the right track in thinking that the problem is due to inaccuracies of trying to do math involving very large and very small numbers in a digital world. Let's look at the original formulas you gave that caused such a huge variation in results: 1st case you used SUM = SUM + A(I)*B(I)/C(I) and came up with a result of .9986 in the second formula you use SUM = SUM + A(I)*B(I)/C(I)/1000 and once out of the loop you then revised the SUM by RESULT2 = SUM * 1000 which gave you 4.2432 In the second case, you've not considered the whole issue of precedence of calculations. Try this instead and see if the results aren't closer for the second case: Dim mySUM as Double mySUM = 0 For I = 1 to 20000 mySUM = (mySUM + A(I)*B(I)/C(I))/1000 Next Result2 = mySUM * 1000 The difference is that in your original 2nd example, all of the multiplications and division gets carried out BEFORE the previous content of mySUM is added to the result, including that last /1000. This changes the whole thing. Don't know why I didn't catch that before. Then afterwards you multiply the whole thing by 1000 but the whole thing was never divided by 1000 within the loop. Examples: Lets say that at the beginning of the 2nd formula, mySUM has a value of 1 and that A(I) = 4, B(I) = 3 and C(I) = 2 then mySUM = mySUM + A(I)*B(I)/C(I)/1000 has a result of 1.006 but mySUM = (mySUM + A(I)*B(I)/C(I)) /1000 has a result of .007 (kind of an ultra-mini-James Bond result) Both answers are equally correct - but which one is correct for your circumstances depends on what you really meant to do. In the first example (1.006 result) the math is actually handled like this: calculate the value of A(I)*B(I)/C(I)/1000 and THEN add the value of mySUM to that result. In the second example the math is done like this: calculate the value of A(I)*B(I)/C(I) and then add the value of mySum to that result, and THEN divide that result by 1000 Use parenthesis pairs to control what numbers are acted upon in what sequence. That is going to be determined by the original formula or algorithm that defines how the numbers should be processed. For more help, look in Excel Help and search for "about calculation operators" - then look in the returned list of topics for one with that same phrase (shows up 2nd in the list that Excel 2003 returned for me). It explains it all in more detail with regards to hierarchy/precendence of operators and how things get done. "monir" wrote: Hello; Here's more relevant info. that might clarify the problem: SUB TEST3() ' ... my code... mySum1 = 0.0 mySum2 = 0.0 mySum3 = 0.0 FOR I = 1 TO 30000 mySum1 = mySum1 + A(I)*B(I)/C(I) mySum2 = mySum2 + A(I)*B(I)/C(I) /1000. mySum3 = mySum3 + A(I)*B(I)/C(I) *1000. NEXT Result1 = mySum1 Result2 = mySum2 *1000. Result3 = mySum3 /1000. ' ...my code... END SUB Variables: Result1, Result2, Result3, mySum1, mySum2, mySum3, A,B,C are DOUBLE data type. Values in A,B,C vary from very small to very large, and could be +ve or -ve, in no specific order. For a typical set of analytical data, the above sample code produces: ...Result1 = 0.9986 .......Result2 = 4.2432 ...........Result3 = 2.5388 The specified accuracies in computing the arrays' elements A,B,C are 1.D-8, 5.D-8, 1.D-8 respectively, which are quite reasonable realizing the complexity of the computational methods. Increasing the specified accuracies would adversely affect the stability of the numerical processes and their conditioning. The FOR loop in the above sample code involves multiplication, division, and addition of a wide range of a large number of small and large, +ve and -ve real numbers. How the processor handles each two operands at each stage, individually and in combinations, and how the intermediate results are represented and stored (in memory), appear to be at the root of the problem here. (These specialized topics are beyond my understanding of computing!!). Sure enough, some real numbers of "nearly" same value but opposite sign could possibly be encountered in A*B/C, but (I think) should not influence the resulting value in mySum = mySum+A*B/C, provided no unnecessary rounding takes place. The situation is considerably different from an ill-posed or ill-conditioned problem That said, let me re-phrase the question: Given one-dimensional arrays A, B, C, each of size N (=20000 to 30000), and values of arrays' elements vary between, say, +/- 1.E-5 and +/- 1.E+5 (i.e.; the value of A*B/C could potentially vary between -1.E-15 and +1.E+15), the question is: WHAT IS THE BEST WAY TO PERFORM THE FOLLOWING ARITHMETIC OPERATION WITH MINIMUM ROUND-OFF ERROR ??? Code: mySum = 0.0 FOR I = 1 TO 20000 mySum = mySum + A(I)*B(I)/C(I) NEXT Result = mySum Any suggestion(s) ?? Best regards. "Tom Ogilvy" wrote: Yes, you have to understand your data and design your algorithm to give you the best answer. -- Regards, Tom Ogilvy "monir" wrote in message ... Hello; The following two sample codes produce two completely different results !! SUB TEST1() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) NEXT RESULT1 = SUM ' ...my code... END SUB SUB TEST2() ' ... my code... SUM = 0.0 FOR I = 1 TO 20000 SUM = SUM + A(I)*B(I)/C(I) /1000. NEXT RESULT2 = SUM*1000. ' ...my code END SUB Variables: Result1, Result2, SUM, A,B,C are DOUBLE data types Values of A,B,C vary from very small to very large, in no specific order For a typical set of analytical data, Result1 = 0.9986, while Result2 = 4.2432, a considerable difference !! The different results are most likely associated with accumilated round-off errors known to be associated with floating-point arithmetic precission. I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Thank you kindly. |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
A Serious Round-Off Problem !!
Hello;
I've successfully followed certain guidelines on how additions & subtractions should be done when rounding is a serious issue. At least for now, I'm getting some consistency in the results, and will post the entire process once it's fully verified. Here's the latest: Code:
SUB TEST4() ' ... my code... Factor = 'see below mySum = 0.0 FOR I = 1 TO 30000 mySum = mySum + A(I)*B(I)/C(I) /(Factor) NEXT Result = mySum *(Factor) ' ...my code... END SUB =.....1.0..........10.0........100.0........1000.0 .....5000.0.....10000.0 Result = 0.85933...0.85975....0.31101....0.11709....0.11709 ....0.11709 Factor =.....1.0........0.100........0.010........0.0010. ....0.0001 Result = 0.85933...0.85911....0.85916....0.85933....0.85933 The "make sense" result for the set of data is around the 0.11500 mark. The remaining inconsistency, I believe, is primarily due to how the term A*B/C is formulated, realizing the wide variation in the individual values. Each element in the arrays A, B & C has a value in the range, say, +/- 1.E-6 to +/- 1.E+6. NO zeros. The question is basically related to Arithmetic Precision in Multiplication and Division. It should not be confused with which operation is performed first and how the compiler does it, but rather how one would formulate the expression in the first place to minimize the round-off error. For example: would: 1.0/C*A*B be more accurate than A*B/C ? how about: A/C*B, or B*A/C ? would assigning intermediate variable help: D = A*B, Result = D/C or D = 1.0/C, E = A*B, Result = D*E etc. Surely, one may continue trying until a specific formulation is established for each set of data! But this trial & error approach would be cumbersome at best, and would be hardly described as scientific computing! Your suggestion(s) regarding Arithmetic Precision in Multiplication and in Division ... in VBA or FORTRAN ...would be greatly appreciated. Regards. "Jerry W. Lewis" wrote: "monir" wrote: I would very much appreciate your suggestion(s) on how to minimize this serious round-off error. Can one use a higher precission than DOUBLE ... either in VBA or FORTRAN ?? Sort by absolute value and sum from smallest to largest. Jerry |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
problem with round down | Excel Discussion (Misc queries) | |||
Round() problem | Excel Worksheet Functions | |||
Problem with the round function | Excel Programming | |||
How do I ROUND() round off decimals of a column dataset? | Excel Worksheet Functions | |||
Round problem with sorting | Excel Programming |