Biometrika Tables for Statisticians, vol 1, Tables 34B and 34C give critical
values for testing normality with the sample skewness and sample kurtosis.
The largest sample size in the table is 2000, where |Kurtosis|0.3 represents
a statistically significant departure from normality.
However, with sample sizes in your range, you have tremendous power to
detect minor departures of little or no practical meaning. Unless there is a
useful alternative model for the analysis, I would tend to look a probability
plot
http://www.usfca.edu/~middleton/demand.pdf
instead of a hypothesis test to assess normality.
Jerry
"Quco" wrote:
Thank you for your super-fast response Dave!
Now, before I posted this today, I did some research on the internet and I
did found the same document in Wikipedia. I wish it could be easier to
determine if there is a limit for this values, let's say -1 to +1, where 0 is
the ideal value for the normal distribution.
The problem is, what can I conclude when I see things like Kurtosis =
2.94948 and Skewness = 1.01299, Kurtosis = 3.72673 and Skewness = 1.81848,
etc... the purpose of this project is to test for normality as we move
forward identifying and removing outliers (the given values do have outliers
present). In theory, we should get a nearly close normal distribution as we
remove outliers after careful examination.
One more thing I am concerned of, is the Excel help file when it is limiting
the values to n=255. If I have samples much greater than that, are the
functions just taking the first 255 values??? that's a little bit confusing.
Finally, is there any other rigorous test (from the ones listed in
Wikipedia) built-in in Excel?
"Dave F" wrote:
Wikipedia has a pretty good discussion of kurtosis. See, especially, the
section "Kurtosis of well known distributions":
http://en.wikipedia.org/wiki/Kurtosis (standard disclaimers about relying on
Wikipedia apply). A normal distribution should have a kurtosis of 0.
A normal distribution should also have a skew of 0.
So, based on the results you give, none of your distributions are normal.
But they don't seem to be either extremely peaked or extremely skewed.
Positive numbers for SKEW indicate that the distribution is skewed to the
right and negative skew indicates that the distribution is skewed to the left.
Does this help give context to the numbers you're getting?
Dave
--
A hint to posters: Specific, detailed questions are more likely to be
answered than questions that provide no detail about your problem.
"Quco" wrote:
I have 5 independent datasets that I am analyzing with descriptive statistics
and Gaussian (normal) distributions to search for outliers. During the
process, I decided to see if there is a way to €śtest€ť a sample for
€śnormality€ť since Im getting some graphs that do not represent normal
distributions. I want to know why. So, I found the functions KURT and SKEW,
but I need some advice about their significance. I do not understand the
values I am getting from these functions and the help files in Excel 2007 are
not very specific. (I wish they could expand more the description listed
under €śExcel Home Function reference Statistical€ť). This is what they say:
// KURT(number1,number2,...) Returns the kurtosis of a data set. Kurtosis
characterizes the relative peakedness or flatness of a distribution compared
with the normal distribution. Positive kurtosis indicates a relatively peaked
distribution. Negative kurtosis indicates a relatively flat distribution.
Number1, number2, ... are 1 to 255 arguments for which you want to
calculate kurtosis. You can also use a single array or a reference to an
array instead of arguments separated by commas. //
// SKEW(number1,number2,...) Returns the skewness of a distribution.
Skewness characterizes the degree of asymmetry of a distribution around its
mean. Positive skewness indicates a distribution with an asymmetric tail
extending toward more positive values. Negative skewness indicates a
distribution with an asymmetric tail extending toward more negative values.
Number1, number2 ... are 1 to 255 arguments for which you want to calculate
skewness. You can also use a single array or a reference to an array instead
of arguments separated by commas. //
For instance:
My datasets range from 75,00 entries to 13,845 entries. I am getting a
Kurtosis = 0.71808 and Skewness = -0.07325 for one dataset, and Kurtosis =
0.59906 and Skewness = 0.91872 for another one, but I dont know what they
mean.
Can anybody help me to discuss/explain/expand these values?