Outlier analysis in excel - trimming means
Correction: instead of ZTEST use:
=NORMDIST(Rng,AVERAGE(Rng),STDEV(Rng),1)
(I just noticed ZTEST divides the standard deviation by the square root
of the number of observaions which is used for testing sample means not
observations)
Lori wrote:
If you are assuming normally distributed data - generally a good
approximation in large samples - you could use the ZTEST function to
give the two tailed P-value of the datapoints. Assuming data is down a
column you could fill down:
=ZTEST(Rng,Rng)
and filter out anything below 2.5% or above 97.5%, this should give
equivalent results to the methods above (actually this is 1.96 sigma,
NORMSDIST(2)=0.977, NORMSDIST(3)=0.999).
Shane Lindsay wrote:
I would like to use excel for outlier analysis removal - to calculate a
mean of a range but not include or change values that are 2 or 3
standard deviations above or below the mean. Excel has a trimmedmean
function but outlier removal based on the above criteria is considered
better for normally distributed data.
I found these formula posted before on this forum. They don't include
values 2 SD above or below the mean. And they work!
=AVERAGE(IF((ABS(rng-AVERAGE(rng)))(2*STDEV(rng)), "", rng))
=TRIMMEAN(Rng,COUNTIF(Rng,""&(AVERAGE(Rng)+2*STDE VP(Rng)))/COUNT(Rng))
I also need to know how many data points were "removed", and then
report that as a percentage, and couldn't figure out to that (bearing
in mind they could be missing data points).
An alternative method of outlier analysis is to:
a. replace outliers with the 2 standard deviations above or below value
b. replace outliers with the mean
Any suggestions on how to do these things would be gratefully received.
|