View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Jerry W. Lewis Jerry W. Lewis is offline
external usenet poster
 
Posts: 837
Default What does Excel's RAND function really do

The original fortran code is provided at
http://lib.stat.cmu.edu/apstat/183
In the original article, Wichmann & Hill state "The algorithm produces
numbers rectangularly distributed between 0 and 1, excluding the end points
but, on some machines, rounding errors might very occasionally produce a
value of 0 precisely." Mathematically it is equivalent to a multiplicative
generator with modulus 27,817,185,604,309 (hence 0 is excluded).

The underlying mechanism produces an integer sequence and then scales it to
U(0,1), so such rounding errors (if present) would only impact the single
value, not the sequence. Wichmann and Hill claim that the integer sequence
has a cycle length of nearly 7E12, so observing a 0 (if the Excel
implementation has that problem, which I do not know) would be extremely rare.

There was a bug in Excel 2003's original implementation that would
frequently return negative numbers, so be sure to use the latest service
patch.

Diehard is no longer the standard battery of tests for random number
generation. Newer tests do find non-random patterns from this generator,
http://docs.python.org/lib/module-random.html
but it is certainly a marked improvement over the random number generator in
previous versions of Excel.

Jerry

"joeu2004" wrote:

On Nov 12, 9:44 am, Brakeshoe
wrote:
According to Excel's built in documentation RAND returns values greater than
or equal to zero and less than one. If this was true I believe I should
eventually see a random number equal to zero be returned.


Googling for "Excel RAND algorithm" (without quotes) found the
following explanation: http://support.microsoft.com/kb/828795 .

Arguably, the MS KB description is not dispositive because there is a
material error -- at least I presume so. It states that the algorithm
returns a random number "on [0,1]" (sic). Of course, the required
result is on [0,1), not [0,1].

On the other hand, if you look at the Fortran(!) implementation, we
see that the result is mod 1.0. That should not return 1.0. So
presumably the KB notation is simply a typo -- or written by someone
who is unfamiliar with the correct mathematical syntax.

Note that the KB article states that the RAND algorithm is capable of
returning "more than 10^13" pseudo-random numbers before repeating.
That is far fewer than the number of double floating-point values in
the range [0,1). I have not analyzed the algorithm to see if it can
return any number (or at least a large number) of sets of "more than
10^13" PRNs. Assuming that it can, as to whether or not zero is
included in any one set of 10^13 PRNs, that might depend, at least, on
how the PRNG is seeded.

This KB article does not explain that. I think there is another KB
article that does; but I am not taking the time now to track it down,
if indeed it exists. Typically, software-based PRNGs are seeded by
some manipulation of the time of day measured to the resolution
supported by the operating system.

HTH.