  #1   Posted to microsoft.public.excel.programming
Multiple (as in many) regressions

Hi all

Does anyone have a quick routine for running multiple regressions over a
range of files in Excel?

I have a whole load of .xls files. Each contains data in columns D and E.
D is always the independent (x axis) and E is always the dependent (y axis),
but each file will have a different number of rows. I've never been too
fond of .xlDown in VBA, and a clumsy routine to count down to the first blank
row would take a while with over 5,000 files!

I need a routine to open up each file, feed the ranges in D and E into the
Regression routine in the Analysis ToolPak, create the outputs and
(crucially) dump them somewhere. With 5,000 files, this means 5,000
regressions and therefore 5,000 worksheets in the results workbook. I'm
not sure what the limit is nowadays - it used to be 256 sheets per book when I
last got anywhere near it.

What I really need is a table showing, for each regression, the
"important" results, i.e. the intercept, the coefficient and the
associated goodness of fit.

If anyone can help, that would be great.

Thx
lk

  #2   Posted to microsoft.public.excel.programming

lk has brought this to us :
<snip>


Firstly, to find the last row of data you should start at the last cell
of the column and use .End(xlUp).Row to find the last entry in the column,
regardless of whether the data is contiguous.
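
A minimal sketch of that, assuming the data starts in row 1 of the first
worksheet of whichever workbook is open:

    Dim ws As Worksheet
    Dim lastRow As Long
    Dim rngX As Range, rngY As Range

    'Find the last used row in column D, working up from the bottom of the sheet.
    Set ws = ActiveWorkbook.Worksheets(1)
    lastRow = ws.Cells(ws.Rows.Count, "D").End(xlUp).Row
    Set rngX = ws.Range("D1:D" & lastRow)   'independent variable (x)
    Set rngY = ws.Range("E1:E" & lastRow)   'dependent variable (y)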

Secondly, I don't recommend you open over 5,000 Excel workbooks just to
grab the contents of Cols D:E. You might want to consider using ADODB
to grab the data (from closed workbooks) into a recordset that you can
manipulate however you like. Examples of how to do this can be
downloaded here...

http://www.appspro.com/conference/Da...rogramming.zip
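
As a rough sketch only (late-bound ADO; it assumes the data sheet in each
.xls is named Sheet1 with no header row, so adjust to suit):

    'Rough sketch: pull columns D:E from a closed .xls into an array via ADO.
    'Assumes the sheet is named "Sheet1" and there is no header row.
    Dim cn As Object, rs As Object
    Dim sPath As String
    Dim vData As Variant

    sPath = "C:\Data\Book1.xls"   'hypothetical path
    Set cn = CreateObject("ADODB.Connection")
    cn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
            "Data Source=" & sPath & ";" & _
            "Extended Properties=""Excel 8.0;HDR=No;IMEX=1"""

    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT * FROM [Sheet1$D1:E65536]", cn, 3, 1   'adOpenStatic, adLockReadOnly

    vData = rs.GetRows()   '2D array: row 0 = column D values, row 1 = column E values
    'Trailing blank rows come back as Nulls and would need to be skipped.

    rs.Close: cn.Close
    Set rs = Nothing: Set cn = Nothing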

I'm not sure why anyone would use an Excel file as a storage container
for output data. Normally, outputs are written to DAT, TXT, or CSV
files so the data can be utilized by any program via its normal file
I/O functions. Excel files take up far more storage space than plain
text files, so they're not an efficient format for any program to
output raw data to. I'm not saying exporting to Excel is bad practice
for analysis purposes, just that it's not an efficient way to store
data for use by other software. So, if the files actually are CSVs,
then the specific data you need to pass to the Analysis ToolPak
functions can be loaded fairly easily using VB's normal file I/O
functions into an array, where it can be quickly parsed as needed for
further use. Since all of this happens in memory (as opposed to
physically opening each file in Excel) it's blazingly fast - much
faster than working with Excel workbooks, and faster even than using
ADODB to grab data from closed workbooks.
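
A bare-bones sketch of that file I/O approach (it assumes comma-delimited
files whose fields mirror the workbook layout, i.e. x in the 4th field and
y in the 5th, with no header row):

    'Bare-bones sketch: load a CSV into arrays using VB's file I/O.
    'Assumes field 4 = column D (x) and field 5 = column E (y), no header row.
    Dim sFile As String, sLine As String
    Dim vFields As Variant
    Dim x() As Double, y() As Double
    Dim n As Long, ff As Integer

    sFile = "C:\Data\Book1.csv"   'hypothetical path
    ReDim x(1 To 100000): ReDim y(1 To 100000)   'oversized for the sketch; trimmed below

    ff = FreeFile
    Open sFile For Input As #ff
    Do While Not EOF(ff)
        Line Input #ff, sLine
        vFields = Split(sLine, ",")
        If UBound(vFields) >= 4 Then   'need at least 5 fields on the line
            n = n + 1
            x(n) = CDbl(vFields(3))    'field 4 = column D (x)
            y(n) = CDbl(vFields(4))    'field 5 = column E (y)
        End If
    Loop
    Close #ff
    If n > 0 Then ReDim Preserve x(1 To n): ReDim Preserve y(1 To n)

From there you should be able to feed the two arrays straight to
Application.WorksheetFunction.Slope, Intercept and RSq without touching a
worksheet at all.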

--
Garry

Free usenet access at http://www.eternal-september.org
ClassicVB Users Regroup! comp.lang.basic.visual.misc


  #3   Posted to microsoft.public.excel.programming



"GS" wrote in message ...

<snip>


If it's any help - I have the whole dataset in a 15,000,000 row CSV file.
I'm quite happy to manipulate it from there, but I couldn't see how to get
Excel to read chunks of it, analyse that chunk and then move on to the
next chunk.

  #4   Posted to microsoft.public.excel.programming

On Dec 23, 2:58 am, "lk" wrote:
<snip>


If it makes things easier, you don't need to use the Data Analysis
ToolPak. You can use the built-in Excel functions SLOPE and INTERCEPT
over whole columns.

In a spare spot you can put something like

    Range("Z1") = "=SLOPE(E:E,D:D)"
    Range("Z2") = "=INTERCEPT(E:E,D:D)"

in your VBA. Then, when the file is opened, the formulas go into Z1:Z2
and the slope and intercept appear, ready to be copied off somewhere
else. It's probably quicker too.
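
Putting that together with the original question, a rough sketch of the
whole loop might look something like this (it assumes all the .xls files
sit in one folder - C:\Data\ is purely a placeholder - with the data on
the first worksheet of each file):

    Sub SummariseRegressions()
        'Rough sketch: loop a folder of .xls files, regress column E on column D
        'in each one, and build a summary table of intercept, slope and R-squared.
        Dim sDir As String, sFile As String
        Dim wb As Workbook, ws As Worksheet
        Dim lastRow As Long, r As Long
        Dim rngX As Range, rngY As Range
        Dim wsOut As Worksheet

        Set wsOut = ThisWorkbook.Worksheets(1)
        wsOut.Range("A1:D1") = Array("File", "Intercept", "Slope", "R squared")
        r = 1

        sDir = "C:\Data\"                'placeholder folder
        sFile = Dir(sDir & "*.xls")
        Application.ScreenUpdating = False
        Do While Len(sFile) > 0
            Set wb = Workbooks.Open(sDir & sFile, ReadOnly:=True)
            Set ws = wb.Worksheets(1)
            lastRow = ws.Cells(ws.Rows.Count, "D").End(xlUp).Row
            Set rngX = ws.Range("D1:D" & lastRow)    'independent (x)
            Set rngY = ws.Range("E1:E" & lastRow)    'dependent (y)

            r = r + 1
            wsOut.Cells(r, 1) = sFile
            wsOut.Cells(r, 2) = Application.WorksheetFunction.Intercept(rngY, rngX)
            wsOut.Cells(r, 3) = Application.WorksheetFunction.Slope(rngY, rngX)
            wsOut.Cells(r, 4) = Application.WorksheetFunction.RSq(rngY, rngX)

            wb.Close SaveChanges:=False
            sFile = Dir
        Loop
        Application.ScreenUpdating = True
    End Sub

Opening 5,000 workbooks this way is the simplest route but also the
slowest; the ADODB or plain file I/O approaches Garry describes above
should be substantially quicker.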

xt

