Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 2,059
Default (redux) PasteSpecialValue doesnt dupe value exactly sometimes (!)

Sigh, I had forgotten about this old thread and did not see Jerry's
posting before today. I stumbled across it when I rediscovered the
problem recently and searched for any relevant threads. Imagine my
surprise! (A "senior moment".) But Google Groups does not permit me
to simply post a reply to the original thread :-(.

On Aug 5 2008, 11:51 am, Jerry W. Lewis
wrote:
there were millions of floating point numbers that
simply could not exist as a constant value in Excel.
You could generate them as a formula result, but if
you pasted that result as a value, then you lost some
of the trailing bits


Jerry seems to be suggesting that Excel converts the value to a
displayable constant (first 15 significant digits), then copies or
pastes that constant.

That was my first hypothesis, too, as noted in my original posting.
And that rule does seem to apply when pasting into other applications
such as Notepad and MSWord.

But as I noted in my original posting, I found a counter-example that
disproves that theory for copy-and-paste-special-value (CAPSV) in
Excel.

Specifically, I showed that 0.999999999999972 - 2^-53 CAPSVs exactly.
That is, the internal representation of the original and pasted value
is &h3FEFFFFF,FFFFFF03 [1]. The cell does display 0.999999999999972
when formatted to 15 sig digits. But its internal representation is
different from the internal representation of 0.999999999999972, which
is &h3FEFFFFF,FFFFFF04 [3]. The difference is exactly 2^-53, the
smallest one-bit incremental difference [2].

And FYI, this behavior (sometimes, but not always, altering the
internal representation) also happens when storing into the Value and
Value2 properties of Excel range objects in V BA. For example:

Dim x As Double, y as Double
y = 201
x = 100 * 2.01
Debug.Print "y: "; dbl2bin(y); " "; dbl2dec(y)
Debug.Print "x: "; dbl2bin(x); " "; dbl2dec(x)
Range("a1") = x
Debug.Print "A1: "; dbl2bin(Range("a1")); " "; dbl2dec(Range
("a1"))

DBL2BIN and DBL2DEC are my own UDFs that show the exact internal
representation in binary and decimal. The Debug.Print statements show
that A1 is "y", not "x"; to wit:

y: &h40692000,00000000 201
x: &h40691FFF,FFFFFFFF
200.999999999999,971578290569595992565155029296875
A1: &h40692000,00000000 201

I conclude that this behavior is driven by an Excel heuristic that
applies to the PasteSpecial xlPasteValues method and the Value and
Value2 properties -- and perhaps other similar operations that store
into Excel objects.

But that heuristic does not apply to UDF results. For example, the
following UDF returns exactly "x":

Function capsv()
Dim x As Double, y as Double
y = 201
x = 100 * 2.01
Debug.Print "y: "; dbl2bin(y); " "; dbl2dec(y)
Debug.Print "x: "; dbl2bin(x); " "; dbl2dec(x)
capsv = x
End Function

Based on many examples, my best explanation of the heuristic is: if
the original value rounded to its 15-sig-digit representation (the
"modified value") is represented exactly within 15 sig digits, convert
the original value to the modified value.

I think that explanation is "close". But, sigh, I found counter-
examples that contradict it.

First, to demonstrate the heuristic, consider the following, example
#1 (based on a random number):

y = 0.200653076171871
x = y + 2 ^ -55

y: &h3FC9AEFF,FFFFFF70
0.200653076171871,00319711134943645447492599487304 6875
x: &h3FC9AEFF,FFFFFF71
0.200653076171871,03095268696506536798551678657531 73828125
A1: &h3FC9AEFF,FFFFFF71
0.200653076171871,03095268696506536798551678657531 73828125

Note that A1 is "x", not "y". That fits the rule because
0.200653076171871 cannot be represented exactly within 15 sig digits,
as shown.

Second, consider example #2:

y = 0.200653076171875
x = y + 2 ^ -55

y: &h3FC9AF00,00000000 0.200653076171875
x: &h3FC9AF00,00000001
0.200653076171875,02775557561562891351059079170227 05078125
A1: &h3FC9AF00,00000000 0.200653076171875

A1 is "y", not "x". Again, that fits the rule because
0.200653076171875 is represented exactly within 15 sig digits.

My explanation of the heuristic seems to work. But now consider
example #3:

y = 200.653076171875
x = y + 2 ^ -45

y: &h406914E6,00000000 200.653076171875
x: &h406914E6,00000001
200.653076171875,028421709430404007434844970703125
A1: &h406914E6,00000001
200.653076171875,028421709430404007434844970703125

A1 is "x". I expected "y" because 200.653076171875 is represented
exactly within 15 sig digits.

Arguably, perhaps the heuristic applies only when the magnitude of the
difference between original and modified values is sufficiently small
-- something less than 2^-45.

But consider counter-example #4:

y = 200.640625
x = y + 2 ^ -45

y: &h40691480,00000000 200.640625
x: &h40691480,00000001
200.640625000000,028421709430404007434844970703125
A1: &h40691480,00000000 200.640625

A1 is "y", following the heuristic, despite the fact that the
magnitude of the difference is the same as example #3, and the
constant "y" in example #3 and #4 are similar in magnitude. (In fact,
the example #4 "y" was derived from example #3.)

So once again, I am out of theories.

Based on other examples, I suspect the heuristic might behave
differently under certain conditions, making it that much harder to
intuit.

But if anyone can demonstrate a viable heuristic, I'm all ears.
(Well, eyes. ;-)


Endnotes:

[1] The form &h3FEFFFFF,FFFFFF03 is my stylized representation of the
64-bit floating-point value in "little endian" order (aka "network
order"), which is not the storage order in Pentium-compatible
systems. The 12-bit sign and biased exponent are in the first 3 hex
characters ("3FE"), and the 52-bit mantissa follows ("F...F03"). The
comma separates upper and lower 32 bits.

[2] The operative phrase is "one-bit" incremental difference. For
example, the two-bit increment 2^-54 + 2^-55 is smaller than 2^-53,
but both =0.5 + 2^-53 and =0.5 + (2^-54 + 2^-55) result in the same
internal representation. That is, both differ from 0.5 by 2^-53,
thanks to the 80-bit FP representation and default FP rounding
algorithm implemented in Pentium-compatible FPUs.

[3] 0.999999999999972 - 2^-53 is stored exactly as
0.999999999999971,91135747698353952728211879730224 609375. (The comma
demarcates 15 significant digits to the left.) It displays as
0.999999999999972; but 0.999999999999972 is stored exactly as
0.999999999999972,02237977944605518132448196411132 8125.

----- original posting -----

On Aug 5 2008, 11:51 am, Jerry W. Lewis
wrote:
This issue is more common than you might imagine. Prior to 2007, there were
millions of floating point numbers that simply could not exist as a constant
value in Excel. You could generate them as a formula result, but if you
pasted that result as a value, then you lost some of the trailing bits.

Excel 2007 allows any valid floating point number to exist as a constant,
but that means that the display routines had never been tested for millions
of values, some of which did not initially display properly -- values just
below 2^16-1 displayed as 100000 and values just below 2^16 displayed as
100001, e.g.http://groups.google.com/group/micro...rashesgpfs/msg...

Jerry



"joeu2004" wrote:
Usually when I copy-and-paste-special-value a numeric value,
the target cell gets exactly the same binary value that the
original cell had.


But I stumbled upon one case (I'm sure there are others) where
that is not true. That is, the pasted value has a different binary
representation.


Under what conditions does this happen?


Consider the following two examples. (The binary value is on
the right. I use Excel 2003.)


1. Unusual(?) case.


A1: 2.00666666666667
A2: =100*round(A1,2) // &h40691FFF,FFFFFFFF
A3: 201 // &h40692000,00000000
A4: copy-and-PSV from A2 // &h40692000,00000000


A2 displays the same character repesentation as A3, even in
Scientific format with 14 DP. But the binary values differ
by 2^-45. As shown, the pasted binary value of A4 is the
same as A3, not A2.


2. Usual(?) case.


B1: 0.999999999999972 // &h3FEFFFFF,FFFFFF04
B2: =B1 - 2^-53 // &h3FEFFFFF,FFFFFF03
B3: copy-and-PSV from B2 // &h3FEFFFFF,FFFFFF03


Again, B1 and B2 display the same character representation
in Scientific with 14 DP. But the binary values differ by
2^-53. Contrary to Example 1, the binary value of B3 is the
same as B2, not B1.


I say "contrary" because to explain Example 1, I hypothesized
that when a value is copied to the Clipboard, Excel converts
the binary value to its most precise character representation,
then converts from the character representation to binary when
we PSV. Example 2 seems to contradict that theory.


Every other theory I've hypothesized does not seem to explain
when PSV will and will not change the binary value.


Example 2 differs by a much smaller amount than Example 1.
The difference in Example 2 is even a slightly smaller percentage
(about 1.11E-16) of the binary value associated with the character
representation than the difference in Example 1 (about 1.414E-E16).


Both examples differ in the least-significant bit. And the LSB
differs in the same "direction", namely from odd to even, which
may be significant when considering the (default) rules for rounding
from 80-bit to 64-bit floating-point in the Intel FPU.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Why PasteSpecialValue doesnt always dupe binary value exactly? joeu2004 Excel Discussion (Misc queries) 2 August 5th 08 08:51 PM
find and return adjacent value (redux)! Todd Lietha Excel Discussion (Misc queries) 3 September 26th 07 01:59 AM
De-Dupe Records Based on Two Fields. onlythebest Excel Discussion (Misc queries) 2 July 17th 06 06:52 PM
how can i de-dupe email lists using excel? Guy Timson Excel Worksheet Functions 1 February 17th 06 10:54 PM
how do i "de-dupe?" rows by a column value? hailah Excel Discussion (Misc queries) 1 May 20th 05 04:04 AM


All times are GMT +1. The time now is 12:00 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"