ExcelBanter - View Single Post

[email protected]

On Sep 24, 6:38*am, jimbo wrote:
When 50% of a data set is a text value, and the rest is numeric,
typically the average is calculated by either making the text value a
numeric, then calculating the mean, or, ignoring the text value and
average the numeric values only.

For instance, the lowest detectable limit for serum aluminum may be
<5, which is a text value. The remaining data ranges between 5-100.
Making all <5s = 5 will overestimate the mean because the <5 values
range between 0 and 4.9, not 5. *By ignoring the <5 is worse because
if the distribution is gaussian, the mean should be close to 47, where
the other approach has a mean around 20.

But there should be a way to take the slope of the distribution as it
approaches the lower reportable limit and the number of text values
then determine the numeric equivalent of the values <5 and then, with
the numeric values, determine the mean of the data set as well as
other statistical measurements.

Does anyone know of such a calculation, a similar, or equivalent
calculation?

Replace "<5" with 2.45.

Or some other value if you know the distribution of the values between
0 & 4.9

Alan Lloyd