Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
I have a number of very large datasets of household water consumption which I
want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
heather wrote:
I have a number of very large datasets of household water consumption which I want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. A few thoughts... In E2007 AFAIK you will not have this problem unless you have up around 1M rows. But, I do not have E2007 so cannot say for sure. In E2003 you might be able to get subtotals adequately by using a Pivot Table. Oh, maybe not since the consolidated data exceeds 64K rows. Were it me hitting the limit, as it were, it would be time to port the data to something more capable. Access is a logical choice. With the consolidated data in one table Access can easily aggregate totals and averages. |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
Thanks, after posting, I did as you said - ported it to Access, and it is
working fine! Still need to figure out some things re averaging in Access, but shouldn't be an issue. (I couldn't do a pivot table in 2003 or 2007 because of the row limitation...) thanks again "smartin" wrote: heather wrote: I have a number of very large datasets of household water consumption which I want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. A few thoughts... In E2007 AFAIK you will not have this problem unless you have up around 1M rows. But, I do not have E2007 so cannot say for sure. In E2003 you might be able to get subtotals adequately by using a Pivot Table. Oh, maybe not since the consolidated data exceeds 64K rows. Were it me hitting the limit, as it were, it would be time to port the data to something more capable. Access is a logical choice. With the consolidated data in one table Access can easily aggregate totals and averages. |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
If ou are using Excel 2003, one option you have is to take the data to
Access and then create a pivot in Excel where the data source will be the Access database. -- Regards, Ashish Mathur Microsoft Excel MVP www.ashishmathur.com "heather" wrote in message ... I have a number of very large datasets of household water consumption which I want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
You might be there already, but in Access it is as simple as creating a
query on the table, click the summation symbol (fancy 'E' [Sigma] icon), Group By the dimensions (LotPlan, 'reading period'), and Average the value 'water consumption'. heather wrote: Thanks, after posting, I did as you said - ported it to Access, and it is working fine! Still need to figure out some things re averaging in Access, but shouldn't be an issue. (I couldn't do a pivot table in 2003 or 2007 because of the row limitation...) thanks again "smartin" wrote: heather wrote: I have a number of very large datasets of household water consumption which I want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. A few thoughts... In E2007 AFAIK you will not have this problem unless you have up around 1M rows. But, I do not have E2007 so cannot say for sure. In E2003 you might be able to get subtotals adequately by using a Pivot Table. Oh, maybe not since the consolidated data exceeds 64K rows. Were it me hitting the limit, as it were, it would be time to port the data to something more capable. Access is a logical choice. With the consolidated data in one table Access can easily aggregate totals and averages. |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Subtotals in very large datasets
Hi,
A few points: 1. If the data source is greater than 65,535 in 2003 it is best to put the data into Access. (Because the Excel 2003 solution for handling more than 65,536 rows in the source is pretty tricky.) 2. If the data source is greater than 1,048,575 in 2007 (1 row for a header) then again Access is the prefered solution. 3. If the resulting Pivot Table would have more than 65,536 rows then you can still create the pivot table in 2003, but some of it won't be viewable. Of course if you had a page filter then you could hide or display blocks of data, so the fact that you could see it all at one time would not be critical. 4. Similar pont for Excel 2007 pivot tables as #3 but with the greater number of displayable rows. 5. In Access choose the Query object and click New, pick your table from the Add Tables (Show Tables) dialog box. Add the fields you want to see by double-clicking each one. Click the Total button and leave Group By on the Total row under LotPlan, open the drop down on the Total row under the column you want to average and choose Avg. Don't include unnecessary fields, Access will group on them also, although there are workarounds. Note that Access will only display the averages, not the detail and averages; if you want both you should consider a Report. -- If this helps, please click the Yes button. Cheers, Shane Devenshire "heather" wrote: I have a number of very large datasets of household water consumption which I want to combine so I have an average reading per house per year (eg I have four spreadsheets per year, for an area (or subset of area, it is a city, and some are VERY large)). Each row has an identifier, LotPlan, most of which are unique, but there are some duplicates (where this is so, they refer to the same parcel, so I can just average or sum the readings with no problem). I have deleted all zero records. What I want to do is run subtotals and average the Avg daily water consumption per LotPlan (some have four readings, some may only have one). However, Excel will not let me subtotal, as it gives me the end of data error, which I figure is related to the size of the spreadsheet. What I was doing was appending the data so I had one worksheet with all the information, identified by LotPlan, and also by reading period; however, I cannot do this either, as then it is more than 65K rows. Are there any ways I can get around these problems in Excel 2003 or 2007? If not, can I do it in Access or SPSS or other software? I have 4Gig RAM, although it is still very slow! Thanks for any help. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
sorting datasets on two rows | Excel Worksheet Functions | |||
most efficient way to filter out and iterate through datasets in X | Excel Discussion (Misc queries) | |||
Manipulating subsets of large datasets | Excel Worksheet Functions | |||
charts macro for a number of datasets | Charts and Charting in Excel | |||
LARGE Function with subtotals | Excel Discussion (Misc queries) |