#1
happycow
 
Multiple regression help


When doing a multiple regression in Excel, what is the meaning of these
outputs (they are all in the same table)?

The first column has my dependent variable, which is labeled here
"Intercept", and the independent variables X2, X3, X4, X5.

The second column is labeled "Coefficients". I think this column has the
slope values of each independent variable: X2, X3, X4, and X5. These
slope values are in relation to all the other variables, so the
slope of X2 is affected by X3, X4 and X5. These slope values are the
measure of how each independent variable affects the dependent variable.
I'm guessing this allows me to predict where added data will go in the
correlation. So for example, if I want to predict how a film will do (my
regression has to do with film gross) according to this data, I would
multiply my variables from that movie (budget (X2), first weekend gross
(X3), user ratings (X4) and MPAA rating (X5)) by the corresponding
Coefficients values in this column. If what I am saying is right (or at
least partially right), I don't know what the value of Intercept is for;
since it is from the dependent variable, it shouldn't have a slope value.


The third column, which is labeled "Standard Error", I'm guessing (if
what I am saying above about the coefficient values is right) is the
accuracy of the predictions that can be made. I'm guessing the larger
the number, the bigger the error.

The fourth column, which is labeled "t Stat": I have no clue what it
means and how it contributes to my regression. I'm thinking it's some
type of testing, but I don't understand what it's testing and why.

The fifth column which is labeled "P-value" is again something I don't
understand. I think it has to do something with "t-Stat". My other
theory is that it has to do something with probability. I really don't
know though.

The next two columns are labeled "Lower 95%" and "Upper 95%". I believe
these are the limits of my correlations. I think that this allows one to
say that "I am 95% sure that the predicted data that lies between these
lowers and uppers can be predicted by the accuracy of the values in
my "Coefficients" column."

I am also wondering about the graph outputs. The first graph, "Line Fit
Plot", outputs 4 scatter diagrams, one for each of my 4 independent
variables. The graphs look like they're comparing my dependent variable
(on the y axis) to an independent variable on the x axis. Is this just
showing the correlation and relationship of each independent variable
to the dependent variable? For each individual diagram, is the
comparison being made and the linear relationship (the direction the lines
seem to be going: positive, negative or none) based on just the
independent variable and the dependent variable, or is the independent
variable's slope taking into account the other 3 independent variables?

The second graph, the "Residual Plot": does this show the measure
of standard error for each point? And the closer to 0 a point gets, the
smaller the error?



#2
Jerry W. Lewis
 


happycow wrote:

When doing a multiple regression in Excel, what is the meaning of these
outputs (they are all in the same table)?

The first column has my dependent variable, which is labeled here
"Intercept", and the independent variables X2, X3, X4, X5.

The second column is labeled "Coefficients". I think this column has the
slope values of each independent variable: X2, X3, X4, and X5. These
slope values are in relation to all the other variables, so the
slope of X2 is affected by X3, X4 and X5. These slope values are the
measure of how each independent variable affects the dependent variable.
I'm guessing this allows me to predict where added data will go in the
correlation. So for example, if I want to predict how a film will do (my
regression has to do with film gross) according to this data, I would
multiply my variables from that movie (budget (X2), first weekend gross
(X3), user ratings (X4) and MPAA rating (X5)) by the corresponding
Coefficients values in this column. If what I am saying is right (or at
least partially right), I don't know what the value of Intercept is for;
since it is from the dependent variable, it shouldn't have a slope value.



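The Intercept row is not the dependent variable; it is the constant term
of the model, i.e. the predicted value when every independent variable
is zero. The predicted value at a given point (x2, x3, x4, x5) is
c0 + x2*c2 + x3*c3 + x4*c4 + x5*c5
where c0 is the intercept and c2, ..., c5 are the slope coefficients.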
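As a rough illustration, the prediction formula can be sketched in a few lines of Python. The coefficient values and the film inputs below are made up purely for the example; in practice they would be copied from the "Coefficients" column of the regression output.

```python
# Hypothetical values standing in for the "Coefficients" column.
coefficients = {
    "Intercept": 2.0,
    "X2": 0.5,   # budget
    "X3": 3.0,   # first weekend gross
    "X4": 1.5,   # user rating
    "X5": -4.0,  # MPAA rating (numerically coded)
}


def predict(coefficients, inputs):
    """Return intercept + sum of (slope * value) over the independent variables."""
    total = coefficients["Intercept"]
    for name, value in inputs.items():
        total += coefficients[name] * value
    return total


# Hypothetical measurements for one new film.
new_film = {"X2": 10.0, "X3": 4.0, "X4": 7.0, "X5": 1.0}
predicted_gross = predict(coefficients, new_film)
print(predicted_gross)
```

Note that the intercept is added once and is not multiplied by anything, which is why it has no variable of its own.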

The third column, which is labeled "Standard Error", I'm guessing (if
what I am saying above about the coefficient values is right) is the
accuracy of the predictions that can be made. I'm guessing the larger
the number, the bigger the error.



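Yes, with one refinement: each standard error measures the uncertainty
in that row's coefficient estimate, not the error of a prediction
directly. The larger it is relative to the coefficient, the less
precisely that coefficient is known.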


The fourth column, which is labeled "t Stat": I have no clue what it
means and how it contributes to my regression. I'm thinking it's some
type of testing, but I don't understand what it's testing and why.



The t statistic is computed as the coefficient divided by its standard
error. Values close to zero (in absolute terms) indicate that the
particular coefficient may not be needed in the model. "Close to zero"
is generally defined in terms of p-values.
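A small Python sketch of that ratio, using hypothetical coefficient and standard error values rather than anything from the original data:

```python
def t_statistic(coefficient, standard_error):
    """t Stat as reported in the table: the estimate divided by its standard error."""
    return coefficient / standard_error


# Hypothetical rows: (coefficient, standard error).
well_determined = t_statistic(3.0, 0.75)   # estimate is 4 standard errors from zero
poorly_determined = t_statistic(0.5, 1.0)  # estimate is within 1 standard error of zero

print(well_determined, poorly_determined)
```

The first row is far from zero relative to its uncertainty; the second could plausibly be zero.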


The fifth column which is labeled "P-value" is again something I don't
understand. I think it has to do something with "t-Stat". My other
theory is that it has to do something with probability. I really don't
know though.



Both guesses are hitting around the issue. If a particular coefficient
does not belong in the model (the true value is zero, so the observed
value is due to random variation), then the p-value is the probability
of observing by chance a coefficient at least as large in magnitude as
occurred with this data set. Thus the smaller the p-value, the stronger
the evidence that a coefficient is really needed. A commonly used
criterion is to assume that if p<0.05, then there is strong evidence
that the coefficient is needed.
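To show how the p-value follows from the t statistic, here is a minimal Python sketch that integrates the Student t density numerically. It assumes the usual two-sided test with the residual degrees of freedom (number of observations minus number of estimated coefficients, intercept included). The function names and the example numbers are illustrative, not taken from Excel.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # math.gamma overflows for very large df; fine for a sketch with modest df.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) under the null, via Simpson's rule on [0, |t_stat|]."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3          # P(0 <= T <= |t_stat|)
    return max(0.0, 1.0 - 2.0 * central_area)


# Hypothetical example: t Stat of 2.228 with 10 residual degrees of freedom.
p = two_sided_p_value(2.228, 10)
print(p)
```

The sign of the t statistic does not matter here, since the test is two-sided.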


The next two columns are labeled "Lower 95%" and "Upper 95%". I believe
these are the limits of my correlations. I think that this allows one to
say that "I am 95% sure that the predicted data that lies between these
lowers and uppers can be predicted by the accuracy of the values in
my "Coefficients" column."



The correct interpretation is that you are 95% confident that the
interval (Lower to Upper) contains the true value for the coefficient.
Note that the interval is random, while the coefficient is not (it is
merely unknown). In particular, for a given data set, the interval
either does or does not contain the true value (although you don't know
which is true). Thus your confidence is in the procedure that generated
the interval, not in the specific interval generated from the specific
data set. It is a subtle concept that is often misunderstood.
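The "confidence is in the procedure" idea can be checked by simulation. The Python sketch below repeatedly generates data from a known straight line, builds a 95% interval for the slope each time, and counts how often the interval contains the true slope. To stay within the standard library it assumes the noise standard deviation is known and uses a normal quantile; Excel estimates the noise from the data and uses a t quantile instead. All names and numbers are hypothetical.

```python
import math
import random
from statistics import NormalDist


def slope_interval(xs, ys, noise_sd, confidence=0.95):
    """Confidence interval for a simple-regression slope, noise_sd assumed known."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    standard_error = noise_sd / math.sqrt(sxx)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return slope - z * standard_error, slope + z * standard_error


def coverage(true_slope, true_intercept, noise_sd, trials, seed):
    """Fraction of simulated data sets whose interval contains the true slope."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(20)]
    hits = 0
    for _ in range(trials):
        ys = [true_intercept + true_slope * x + rng.gauss(0.0, noise_sd) for x in xs]
        lower, upper = slope_interval(xs, ys, noise_sd)
        if lower <= true_slope <= upper:
            hits += 1
    return hits / trials


rate = coverage(true_slope=2.0, true_intercept=1.0, noise_sd=3.0, trials=2000, seed=1)
print(rate)  # should land near 0.95
```

Any single interval either contains 2.0 or it does not; the 95% describes how often the procedure succeeds across repeated data sets.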


I am also wondering about the graph outputs. The first graph, "Line Fit
Plot", outputs 4 scatter diagrams, one for each of my 4 independent
variables. The graphs look like they're comparing my dependent variable
(on the y axis) to an independent variable on the x axis. Is this just
showing the correlation and relationship of each independent variable
to the dependent variable? For each individual diagram, is the
comparison being made and the linear relationship (the direction the lines
seem to be going: positive, negative or none) based on just the
independent variable and the dependent variable, or is the independent
variable's slope taking into account the other 3 independent variables?



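More or less. As far as I know, each Line Fit Plot charts the observed
and the predicted values of the dependent variable against one
independent variable, and the predicted values come from the full
model, so they reflect all four independent variables.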

The second graph, the "Residual Plot": does this show the measure
of standard error for each point? And the closer to 0 a point gets, the
smaller the error?


Residuals are observed values minus predicted values, so a residual
near zero means the model predicted that point well. If the model is
correct, the points in each residual plot should be scattered randomly
around zero with no visible structure. If there is a systematic pattern
in one or more residual plots, then the model is probably inadequate.
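For a concrete picture of a "systematic pattern", here is a small Python sketch with made-up data: a straight line is fitted to values that actually follow a curve, and the residuals come out positive at both ends and negative in the middle. It uses a single independent variable to keep the arithmetic short; the helper name `fit_line` is just for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable: returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up data that is really quadratic, so a straight line is the wrong model.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x * x for x in xs]

intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]

# Residual = observed value minus predicted value.
residuals = [y - p for y, p in zip(ys, predicted)]
print(residuals)
```

Plotted against x, these residuals trace a U shape instead of a random scatter around zero, which is the kind of pattern that signals a missing term in the model.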

All of these questions deal with standard concepts from any introductory
statistics course. I highly recommend that you take such a course or at
least read an introductory statistics text, since there is more to
understand than is likely to be imparted in a few newsgroup replies.

Jerry
