View Single Post
  #8   Report Post  
Dana DeLouis
 
Posts: n/a
Default t-distribution puzzle in Excel

repeating this exercise using new random numbers, I found the sample
kurtosis jumps around ~2.9 with a standard deviation of ~1.1. The
theoretical value for the excess kurtosis of t distribution is 6 / (df
- 4) = 6,


Hi. I'm not a Stats guy, so this is just for gee wiz.
Just for feedback, the math program Mathematica appears to agree with Excel.
Most of the values returned by KurtosisExcess are below 6.
I "think" that Excel (2003) is doing it correctly!

These values agree with what you mentioned...

Variance[StudentTDistribution[5]]}
5/3 (or 1.6666...)

KurtosisExcess[StudentTDistribution[5]]
6

( I know this isn't Excel, but I thought it would be easier to show it this
way instead of just the answers.)

= = = = = = = = = = =
Example #1 Data size = 30,000. The value of KurtosisExcess jumps
around. Some values were below 3, and above 10. df =5.
Here's 10 samples...

t=Table[KurtosisExcess[RandomArray[StudentTDistribution[5], 30000]], {10}]

{5.111078, 4.0498174, 3.8689816, 4.5970144, 7.0692163, 3.2919761,
5.9771902, 4.1000405, 3.23158, 4.8915233}

{Mean[t], StandardDeviation[t]}

{4.6188418, 1.2021682}


= = = = = = = = = = =
Example #2 Data size = 300,000 (Larger size)

t=Table[KurtosisExcess[RandomArray[StudentTDistribution[5], 300000]],
{10}]

{4.4799506, 4.6979051, 7.693889, 4.8315766, 6.7575222, 4.685617,
4.415599, 4.9943591, 6.166258, 4.404873}

{Mean[t], StandardDeviation[t]}

{5.312755, 1.1504864}

As you can see, KurtosisExcess is usually below 6 and varies quite a bit.
Maybe someone good at Stats can jump in and confirm.
--
Dana DeLouis
Win XP & Office 2003


wrote in message
oups.com...
I tried a couple of standard ways to generate random numbers that
follow Student's t-distribution. Something is very wrong, and I don't
know what.

Start with a simple example. In column A, I put 1000 uniformly
distributed on (0,1) numbers, using RAND(). Then I apply to them the
function TINV. E.g., in cell B1, I have

=IF(A1<0.5,-TINV(2*A1,df),TINV(2*(1-A1),df))

where df is my degrees of freedom. I fixed df = 5.

I obtain a column of (presumably) t-distributed numbers. I then check
their excess kurtosis (as calculated by the KURT() function). As I was
repeating this exercise using new random numbers, I found the sample
kurtosis jumps around ~2.9 with a standard deviation of ~1.1. The
theoretical value for the excess kurtosis of t distribution is 6 / (df
- 4) = 6, and the theoretical stdev of the sample kurtosis itself is
sqrt(24/1000) ~ 0.15. So I am puzzled as to why the excess kurtosis is
so incredibly far off. Also puzzled by why it's so unstable.

I then replaced Excel's rand() worksheet function with the
Mersenne-Twister implementation in NtRand
http://www.numtech.com/NtRand/.

Similar result (the sample excess kurtosis jumps wildly but still very
far below the theoretical value of 6).

I then added another column of random uniform (0,1) numbers (in column
C), and tried using instead the formula:

=NORMSINV(A1)/SQRT(df/GAMMAINV(C1,df/2,2))

Similar result.

I tried usign 60,000 instead of 1000 numbers. Now the theoretical stdev
of the sample kurtosis is just 0.02.

Similar result.

I checked the variance and skewness, and those match their values ( df
/ df - 2 = 1.667 and 0 respectively) very closely.

I checked the normal distribution, and its kurtosis matches the
theoretical perfectly.

I know I'm doing something wrong, but what? =(

Thanks....