Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Manipulating subsets of large datasets
I use Excel for manipulating and analysing scientific data. I find myself
repeatedly performing similar tasks, and I am sure there must be an easier way to do this. I would welcome any suggestions. There are really two distinct problems. First, datasets of 20-30 columns, 1000s of rows. I want to perform calculations on multiple subsets within this data. For example a countif with criteria from two columns. Currently I sort the data and manually set the data range of the expressions according to one of the criteria. I'm sure I am being slow, but is there an easier way to base this function on multiple criteria? Secondly, one of the criteria is month of data collection. This data is currently stored as a dd/mm/yyyy date data type. Is there an easy way to make Excel recognise the month without constructing a complicated function? I am using greater and less than functions, but this gets very involved when dealing with data that covers a number of years. Is there an easy way to make Excel recognise the month portion of a date string? I'm happy to provide more information, I may not have explained this very well. I would be grateful for any suggestions or insights into how people approach these sort of issues. I'm using excel 2003. Thanks in anticipation. |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Manipulating subsets of large datasets
You don't sound like a "beginner", so check out the Sumproduct() function as
explained very lucidly by Bob Philips on this web page: http://www.xldynamic.com/source/xld.SUMPRODUCT.html The XL Help files are completely inept as far as describing the scope of this function. Post back with any questions. -- HTH, RD --------------------------------------------------------------------------- Please keep all correspondence within the NewsGroup, so all may benefit ! --------------------------------------------------------------------------- "Astrofin" wrote in message ... I use Excel for manipulating and analysing scientific data. I find myself repeatedly performing similar tasks, and I am sure there must be an easier way to do this. I would welcome any suggestions. There are really two distinct problems. First, datasets of 20-30 columns, 1000s of rows. I want to perform calculations on multiple subsets within this data. For example a countif with criteria from two columns. Currently I sort the data and manually set the data range of the expressions according to one of the criteria. I'm sure I am being slow, but is there an easier way to base this function on multiple criteria? Secondly, one of the criteria is month of data collection. This data is currently stored as a dd/mm/yyyy date data type. Is there an easy way to make Excel recognise the month without constructing a complicated function? I am using greater and less than functions, but this gets very involved when dealing with data that covers a number of years. Is there an easy way to make Excel recognise the month portion of a date string? I'm happy to provide more information, I may not have explained this very well. I would be grateful for any suggestions or insights into how people approach these sort of issues. I'm using excel 2003. Thanks in anticipation. |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Manipulating subsets of large datasets
You might want to look at storing your data in a database rather than a
spreadsheet. What you are asking is trivial using SQL. MH "Astrofin" wrote in message ... I use Excel for manipulating and analysing scientific data. I find myself repeatedly performing similar tasks, and I am sure there must be an easier way to do this. I would welcome any suggestions. There are really two distinct problems. First, datasets of 20-30 columns, 1000s of rows. I want to perform calculations on multiple subsets within this data. For example a countif with criteria from two columns. Currently I sort the data and manually set the data range of the expressions according to one of the criteria. I'm sure I am being slow, but is there an easier way to base this function on multiple criteria? Secondly, one of the criteria is month of data collection. This data is currently stored as a dd/mm/yyyy date data type. Is there an easy way to make Excel recognise the month without constructing a complicated function? I am using greater and less than functions, but this gets very involved when dealing with data that covers a number of years. Is there an easy way to make Excel recognise the month portion of a date string? I'm happy to provide more information, I may not have explained this very well. I would be grateful for any suggestions or insights into how people approach these sort of issues. I'm using excel 2003. Thanks in anticipation. |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Manipulating subsets of large datasets
Ragdyer:
Sumproduct is just what I was looking for. What an elegant solution! Thank you. MH: The data is exported from a sql database. The database is currently under development and doesn't provide the calculation and graphing functions that I require. Yet! I should probably start a new thread for this, but I'm still stumped by the date issue. Dates are stored as a dd/mm/yyyy date string. How can I filter out dates within a particular month for use within calculations, without constructing complicated if and if< functions. Not only does this become unwieldy, but is difficult when data covers multiple years. Thank you for your help and suggestions. |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Manipulating subsets of large datasets
If I understand what you're looking for, the easiest way to return data
that's bracketed between dates is to assign 2 cells to contain those limitation dates, whether those restrictions be days, months, or years. For example, with A1 containing the "start" date, and A2 the "ending" date, and item colors listed in Column C, and item weights in Column D, and the dates these items were "used" listed in Column B ... you're looking for a count of all red items of less then 10 pounds that were used between June 1, 2006 and August 15, 2007: A1 = 6/1/06 A2 = 8/15/07 (my date formats) =Sumproduct((B2:B1000=A1)*(B2:B1000<=A2)*(C2:C100 0="Red")*(D2:D1000<10)) Needless to say, the criteria of "Red" and 10 could *also* be referenced in other cells, to eliminate the need to revise the formula itself. Just to add *another variable*, say you *only* wanted to count the red items of less then 10 lbs. that were used in the month of February, between 2005 and 2007, you might try something like this: A1 = 2/1/05 A2 = 3/1/07 (so you don't have to remember if it was a leap year or not) =SUMPRODUCT((B2:B1000=A1)*(B2:B1000<=A2)*(MONTH(B 2:B1000)=2)*(C2:C1000="Red")*(D2:D1000<10)) And of course, the month number 2 could have been assigned to a cell. -- HTH, RD ================================================== === Please keep all correspondence within the Group, so all may benefit! ================================================== === "Astrofin" wrote in message ... Ragdyer: Sumproduct is just what I was looking for. What an elegant solution! Thank you. MH: The data is exported from a sql database. The database is currently under development and doesn't provide the calculation and graphing functions that I require. Yet! I should probably start a new thread for this, but I'm still stumped by the date issue. Dates are stored as a dd/mm/yyyy date string. How can I filter out dates within a particular month for use within calculations, without constructing complicated if and if< functions. Not only does this become unwieldy, but is difficult when data covers multiple years. Thank you for your help and suggestions. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
charts macro for a number of datasets | Charts and Charting in Excel | |||
Xcl-create subsets of a set of numbers w/o duplicating the subset | Excel Worksheet Functions | |||
match two columns (with equal dates) and their attached datasets | Excel Discussion (Misc queries) | |||
XL2K: Z origin (floor) on column charts defaults to non-zero for certain datasets | Charts and Charting in Excel | |||
Manipulating large worksheets and hardware capabilities | Excel Discussion (Misc queries) |