Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
showcase48
 
Posts: n/a
Default Regression with more than 16 variables

I noticed linest and the regrssion tool doesn't allow more than 16 variables.
Does anyone know the most convenient way to do this? I currently don't have
SAS. Is it possible in Access or anything else?
  #2   Report Post  
Alex
 
Posts: n/a
Default

Without being pessimistic I think you'll struggle to do this mutlivariate
regression in excel. There may be a way but it will be long-winded I am sure.

As a suggestion, stats websites often allow you to input data and see the
output running off Javascript as an example. Try and find a website that you
can cut and paste your data to and crank the analysis...

SAS could do it easily. The R project (google it to find it) offers a free
stats package. It is the update of S and is pretty much as good as it gets.
Upside is that it is free, downside is that it is not user friendly in that
you would have to put some hours in before you got around to doing your
multivariate regression.

Sorry to be a naysayer....


Alex

"showcase48" wrote:

I noticed linest and the regrssion tool doesn't allow more than 16 variables.
Does anyone know the most convenient way to do this? I currently don't have
SAS. Is it possible in Access or anything else?

  #3   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

You could directly implement the equations that were used in Excel prior
to 2003
http://groups-beta.google.com/group/...595e596d?hl=en

Also, the address of the R project is www.r-project.org, and there is an
supplemental package for an R-Excel interface

Jerry

showcase48 wrote:

I noticed linest and the regrssion tool doesn't allow more than 16 variables.
Does anyone know the most convenient way to do this? I currently don't have
SAS. Is it possible in Access or anything else?


  #4   Report Post  
MrShorty
 
Posts: n/a
Default


How many rows/columns will the built in matrix functions (mmult,
minverse, etc.) handle? Least squares regression "reduces" to some
fairly simple matrix algebra which can be performed using the built in
matrix functions. Most of the LSQ regression I do, I prefer to use the
matrix notation rather than LINEST or LOGEST, but I've never had to
regress on more than 5 or 6 parameters. I haven't heard of any limit
to the size of matrix these equations can handle, so, in theory anyway,
you should be able to perform the regression this way.


--
MrShorty
------------------------------------------------------------------------
MrShorty's Profile: http://www.excelforum.com/member.php...o&userid=22181
View this thread: http://www.excelforum.com/showthread...hreadid=381662

  #5   Report Post  
Mike Middleton
 
Posts: n/a
Default

MrShorty -

How many rows/columns will the built in matrix functions (mmult, minverse,
etc.) handle? Least squares regression "reduces" to some fairly simple
matrix algebra which can be performed using the built in matrix functions.
Most of the LSQ regression I do, I prefer to use the matrix notation
rather than LINEST or LOGEST, but I've never had to regress on more than 5
or 6 parameters. I haven't heard of any limit to the size of matrix these
equations can handle, so, in theory anyway, you should be able to perform
the regression this way. <


Some of the limitations, e.g., maximum 52 columns by 52 rows for MINVERSE,
are described at the end of the following Knowledge Base article:

http://support.microsoft.com/default...b;en-us;166342

- Mike
www.mikemiddleton.com




  #6   Report Post  
MrShorty
 
Posts: n/a
Default


Based on the information in that article, once could (assuming enough
RAM, processor time, and such) theoretically do a least squares
regression for up 52 parameters, then.


--
MrShorty
------------------------------------------------------------------------
MrShorty's Profile: http://www.excelforum.com/member.php...o&userid=22181
View this thread: http://www.excelforum.com/showthread...hreadid=381662

  #7   Report Post  
Jerry W. Lewis
 
Posts: n/a
Default

MrShorty wrote:

Based on the information in that article, once could (assuming enough
RAM, processor time, and such) theoretically do a least squares
regression for up 52 parameters, then.


Theoretically yes, but forming the normal equations is a numerically
poor way to do least squares, since numerical singularity is then
governed by the ratio of largest to smallest eigenvalues in X'X instead
of that ratio in X. MINVERSE will often give a numerically inaccurate
solution instead of warning of approximate numerical singularity.
Consequently I would tend distrust the output.

This issue in 1 dimension is the difference between calculating
=SUMSQ(data)-SUM(data)^2/COUNT(data)
(as was done by VAR, STDEV, etc prior to Excel 2003) and calculating
=DEVSQ(data)
The first form becomes more and more inaccurate as the CV of the data
decreases. For example, STDEV(x,x+0.1) should be 0.0707106781186548
=SQRT(0.005) regardless of x. But for x=100000 (CV=0.00007%) the first
form (and STDEV in Excel 2002 and earlier) is inaccurate in the 4th
significant figure of the standard deviation calculation. Between
1000000 and 10000000, all significant figures are lost in the first
calculation (summing squared x's has pushed the interesting part of the
calculation beyond the accuracy if IEEE double precision). While the
second form continues to work well.

Personally, I can't remember ever being asked to analyze data with a CV
that small, (and if I did, I would have used coding (linear
transformation to a more practical range) to avoid numerical problems,
but things are not so clear in the multiple dimensions of the general
linear model. For instance, if you fit a 6th degree polynomial where
the x data is 20.0, 20.5, 21.0, 21.5, ..., 27.0, the condition number of
X'X is around 10^30, suggesting that direct solution of the normal
equations would require quadruple precision to calculate it
http://groups-beta.google.com/group/...9a2bb33e6cdbb8
Pre-2003 LINEST (or the direct MINVERSE() formula linked from my
previous post) will try, but will get no figures correct without
warning. SAS PROC GLM and PROC REG will complain of numerical
singularity (SAS's use of the sweep operator necessarily requires it to
form the normal equations). That is why books on statistical computing
recommend other approaches, such as Givens rotations, as implemented in
http://lib.stat.cmu.edu/apstat/274
http://lib.stat.cmu.edu/apstat/75
the lm() function in S-PLUS and R will get 8 figures correct for each
coefficient in the preceding example despite the ill-conditioning.
Excel's trendline will get 9 figures correct, but will not go beyond a
6th degree polynomial. There is a contributed package for R that would
allow you to pass data from Excel to R, analyze it in R, and pass the
result back to Excel. To learn more about R, go to
http://www.r-project.org

Jerry

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
How do you populate the X variables in Regression analysis for mo. tjp32 Excel Worksheet Functions 1 April 4th 05 09:29 PM
Plot 2 variables on Y axis nemme Charts and Charting in Excel 1 March 18th 05 08:17 PM
LINEST maximum number of predictor variables Peter N Excel Worksheet Functions 7 February 17th 05 10:14 AM
Problem seting-up Regression Macro Confused VB Person Excel Discussion (Misc queries) 1 February 9th 05 08:05 AM
Index Match With 3 Variables Scooterdog Excel Worksheet Functions 5 January 2nd 05 07:05 PM


All times are GMT +1. The time now is 09:49 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"