Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 148
Default Subtotals in very large datasets

I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city, and
some are VERY large)). Each row has an identifier, LotPlan, most of which are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I was
doing was appending the data so I had one worksheet with all the information,
identified by LotPlan, and also by reading period; however, I cannot do this
either, as then it is more than 65K rows. Are there any ways I can get around
these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 915
Default Subtotals in very large datasets

heather wrote:
I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city, and
some are VERY large)). Each row has an identifier, LotPlan, most of which are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I was
doing was appending the data so I had one worksheet with all the information,
identified by LotPlan, and also by reading period; however, I cannot do this
either, as then it is more than 65K rows. Are there any ways I can get around
these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.


A few thoughts...

In E2007 AFAIK you will not have this problem unless you have up around
1M rows. But, I do not have E2007 so cannot say for sure.

In E2003 you might be able to get subtotals adequately by using a Pivot
Table. Oh, maybe not since the consolidated data exceeds 64K rows.

Were it me hitting the limit, as it were, it would be time to port the
data to something more capable. Access is a logical choice. With the
consolidated data in one table Access can easily aggregate totals and
averages.

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 148
Default Subtotals in very large datasets

Thanks, after posting, I did as you said - ported it to Access, and it is
working fine! Still need to figure out some things re averaging in Access,
but shouldn't be an issue. (I couldn't do a pivot table in 2003 or 2007
because of the row limitation...)
thanks again

"smartin" wrote:

heather wrote:
I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city, and
some are VERY large)). Each row has an identifier, LotPlan, most of which are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I was
doing was appending the data so I had one worksheet with all the information,
identified by LotPlan, and also by reading period; however, I cannot do this
either, as then it is more than 65K rows. Are there any ways I can get around
these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.


A few thoughts...

In E2007 AFAIK you will not have this problem unless you have up around
1M rows. But, I do not have E2007 so cannot say for sure.

In E2003 you might be able to get subtotals adequately by using a Pivot
Table. Oh, maybe not since the consolidated data exceeds 64K rows.

Were it me hitting the limit, as it were, it would be time to port the
data to something more capable. Access is a logical choice. With the
consolidated data in one table Access can easily aggregate totals and
averages.


  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 1,766
Default Subtotals in very large datasets

If ou are using Excel 2003, one option you have is to take the data to
Access and then create a pivot in Excel where the data source will be the
Access database.

--
Regards,

Ashish Mathur
Microsoft Excel MVP
www.ashishmathur.com

"heather" wrote in message
...
I have a number of very large datasets of household water consumption
which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city,
and
some are VERY large)). Each row has an identifier, LotPlan, most of which
are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I
was
doing was appending the data so I had one worksheet with all the
information,
identified by LotPlan, and also by reading period; however, I cannot do
this
either, as then it is more than 65K rows. Are there any ways I can get
around
these problems in Excel 2003 or 2007? If not, can I do it in Access or
SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.


  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 915
Default Subtotals in very large datasets

You might be there already, but in Access it is as simple as creating a
query on the table, click the summation symbol (fancy 'E' [Sigma] icon),
Group By the dimensions (LotPlan, 'reading period'), and Average the
value 'water consumption'.

heather wrote:
Thanks, after posting, I did as you said - ported it to Access, and it is
working fine! Still need to figure out some things re averaging in Access,
but shouldn't be an issue. (I couldn't do a pivot table in 2003 or 2007
because of the row limitation...)
thanks again

"smartin" wrote:

heather wrote:
I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city, and
some are VERY large)). Each row has an identifier, LotPlan, most of which are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I was
doing was appending the data so I had one worksheet with all the information,
identified by LotPlan, and also by reading period; however, I cannot do this
either, as then it is more than 65K rows. Are there any ways I can get around
these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.

A few thoughts...

In E2007 AFAIK you will not have this problem unless you have up around
1M rows. But, I do not have E2007 so cannot say for sure.

In E2003 you might be able to get subtotals adequately by using a Pivot
Table. Oh, maybe not since the consolidated data exceeds 64K rows.

Were it me hitting the limit, as it were, it would be time to port the
data to something more capable. Access is a logical choice. With the
consolidated data in one table Access can easily aggregate totals and
averages.




  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 3,346
Default Subtotals in very large datasets

Hi,

A few points:

1. If the data source is greater than 65,535 in 2003 it is best to put the
data into Access. (Because the Excel 2003 solution for handling more than
65,536 rows in the source is pretty tricky.)
2. If the data source is greater than 1,048,575 in 2007 (1 row for a
header) then again Access is the prefered solution.
3. If the resulting Pivot Table would have more than 65,536 rows then you
can still create the pivot table in 2003, but some of it won't be viewable.
Of course if you had a page filter then you could hide or display blocks of
data, so the fact that you could see it all at one time would not be critical.
4. Similar pont for Excel 2007 pivot tables as #3 but with the greater
number of displayable rows.
5. In Access choose the Query object and click New, pick your table from
the Add Tables (Show Tables) dialog box. Add the fields you want to see by
double-clicking each one. Click the Total button and leave Group By on the
Total row under LotPlan, open the drop down on the Total row under the column
you want to average and choose Avg. Don't include unnecessary fields, Access
will group on them also, although there are workarounds. Note that Access
will only display the averages, not the detail and averages; if you want both
you should consider a Report.

--
If this helps, please click the Yes button.

Cheers,
Shane Devenshire


"heather" wrote:

I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have
four spreadsheets per year, for an area (or subset of area, it is a city, and
some are VERY large)). Each row has an identifier, LotPlan, most of which are
unique, but there are some duplicates (where this is so, they refer to the
same parcel, so I can just average or sum the readings with no problem). I
have deleted all zero records.

What I want to do is run subtotals and average the Avg daily water
consumption per LotPlan (some have four readings, some may only have one).
However, Excel will not let me subtotal, as it gives me the end of data
error, which I figure is related to the size of the spreadsheet. What I was
doing was appending the data so I had one worksheet with all the information,
identified by LotPlan, and also by reading period; however, I cannot do this
either, as then it is more than 65K rows. Are there any ways I can get around
these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS
or other software? I have 4Gig RAM, although it is still very slow! Thanks
for any help.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
sorting datasets on two rows anyole Excel Worksheet Functions 4 July 1st 08 07:31 PM
most efficient way to filter out and iterate through datasets in X Bart Wouters[_2_] Excel Discussion (Misc queries) 1 February 12th 08 03:54 PM
Manipulating subsets of large datasets Astrofin Excel Worksheet Functions 4 March 7th 07 05:11 PM
charts macro for a number of datasets Nathan D Charts and Charting in Excel 1 February 23rd 07 02:58 AM
LARGE Function with subtotals hindsight Excel Discussion (Misc queries) 3 August 3rd 05 03:09 PM


All times are GMT +1. The time now is 01:25 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"