ExcelBanter - View Single Post

wrote...
Harlan Grove wrote:
....
Undue skepticism about documented functionality
isn't wisdom, it's paranoia.

In my case, it is based on decades of experience with
being on the opposite side -- the person responsible
for implementing and supporting some functionality.
I know the value of the flexibility of undocumented
behavior -- the ability to evolve behavior judiciously.
I also know the "cop-out" value of undocumented behavior
-- the freedom not to support such behavior when it is
unwise to do so. And I know the danger of documenting
"too well" -- the inflexibility that it can cause, because
people come to depend on the documented behavior.

OK, but whether a PRNG is periodic or not is a fundamental operational
characteristic. When documented, it should be relied upon (in *EXACTLY*
the same way one should rely on numbers in Excel being carried to 15
and NO MORE THAN 15 significant decimal digits of precision).

Whether paranoia is warranted when using Microsoft
software with Microsoft documentation is debatable.

My comments have nothing to do with the endless
parochial debates that some people like to engage in.
In fact, my comments were honed by experience with
software in another industry.

OK, so are you assuming you're the only participant in these newsgroups
with software development experience? The regulars' occupations span
aircraft design and manufacture, financial services, and academic
statistics and mathematics, and include several scientists and
engineers, so which otherwise unrepresented industry do you believe
you represent?

Given the need for simulating sampling without
replacement, would there ever be hardware RNGs
without a library routine to produce samples without
replacement?

Sure! The hardware RNG I am familiar with does not.
Why should it? Why would you expect it of a hardware
RNG, if we don't see it with most software RNGs --
Excel, for example? ;-)
....

Precisely because software PRNGs (the 'P' stands for pseudo, and that
makes *ALL* the difference) are necessarily periodic. None have small
(<1E6) periods, so *ALL* are basically reliable for use in sampling
without replacement when population and sample sizes are smaller than
the period. It just requires a set of *UNROUNDED* deviates of the same
cardinality as the population from which you're sampling. So in Excel a
set of 15 RAND() calls *ALWAYS* and *RELIABLY* represents sampling from
more than 10^6 values strictly between 0 and 1 without replacement.
It's trivial to create 1-to-1 relations between such sets of deviates
and other sets of distinct values with the same cardinality.
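
To make the 1-to-1 mapping concrete, here is a minimal sketch in Python
rather than Excel. It assumes Python's random module as a stand-in for
RAND(), and the function name is hypothetical. It draws one unrounded
deviate per population member and ranks the deviates, which is roughly
what a helper column of RAND() plus RANK() would do in a worksheet.
Python's generator does not share Excel's documented guarantee of
distinct deviates, so the sketch checks for ties explicitly.

```python
import random


def sample_without_replacement(population, k, rng=None):
    """Pick k distinct items by ranking one uniform deviate per item.

    Illustrative only: random.Random stands in for Excel's RAND().
    """
    rng = rng or random.Random()
    items = list(population)
    if not 0 <= k <= len(items):
        raise ValueError("k must be between 0 and the population size")

    # One unrounded deviate per population member (one RAND() per row).
    deviates = [rng.random() for _ in items]

    # The argument relies on the deviates being distinct; verify it here
    # instead of assuming it, since this generator makes no such promise.
    if len(set(deviates)) != len(deviates):
        raise RuntimeError("duplicate deviate drawn; redraw the set")

    # Ordering the indices by deviate is the 1-to-1 mapping (the RANK step).
    order = sorted(range(len(items)), key=deviates.__getitem__)
    return [items[i] for i in order[:k]]


if __name__ == "__main__":
    # Draw 5 distinct values from 1..52, like dealing 5 cards from a deck.
    print(sample_without_replacement(range(1, 53), 5))
```

Ranking the full set gives a random permutation of the population, so
any prefix of it is a sample without replacement.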

Maybe some day in the distant future Excel will use hardware RNGs, but
it certainly doesn't now. Shouldn't you make use of current
*documented* functionality? Is there any sense in designing for
potential functionality that's unlikely to be available for years?