![]() |
Analyzing data in collections of classes?
I have a large data set that I need to analyze based on several factors, and
I'm thinking this /might/ be the right time to dip my toe into OOP (or as much as VBA allows) instead of my standard linear approach. So far, the most understandable resource I've found is http://www.cpearson.com/excel/Classes.aspx but I'm still having trouble conceptualizing how to set it up (and if the actual analyses I need to perform would benefit from an OOP approach). My source data represents many thousands of purchases, within each purchase there will be multiple products, and within each product there will be multiple quantities, packaging, etc. It is possible that the same product will show up in the same order more than once, with different properties (e.g. milk.size = 1 gallon, and milk.size = 1 pint). So I might have instances of purchases: purchase.date purchase.time purchase.amount purchase.payment_type purchase.membercard_number Then within each purchase, I have instances of items: Item.type 'veggie/dairy/produce/bulk/dry goods/pets/etc Item.brand item.package_size Item.quantity Item.onsale 'boolean First question: In this case, purchase is more than a flat collection, because it has it's own class properties, but also needs to function as a collection of items. What is the proper approach to set that up? Once I have my data loaded from my worksheet into the objects, my analysis will be to look for specific patterns; What was the total amount of vanilla ice cream purchased from Brand_C as compared to vanilla ice cream purchased from Brand_X, excluding purchases where Brand_X was purchased in the same transaction as any items from Brand_Z? I'm thinking I could just create additional properties for each transaction to simplify the analysis, for example, purchase.IncludesBrandXVanillaIceCream_but_noBrand Z 'boolean purchase.TotalDollarsOfBrandXVanillaIceCream Does OOP (or the VBA equivalent) really make these types of analyses easier, or should I just stick with linear data crunching? I appreciate your advice, Keith |
Analyzing data in collections of classes?
Ker writes: "Once I have my data loaded from my worksheet into the
objects, my analysis will be to look for specific patterns; What was the total amount of vanilla ice cream purchased from Brand_C as compared to vanilla ice cream purchased from Brand_X, excluding purchases where Brand_X was purchased in the same transaction as any items from Brand_Z? " Don't load the data into the worksheet. If it's not loaded into a database already, load it into a database. Let the database handle the queries like this one. |
Analyzing data in collections of classes?
Agree with gimme here. Can't see much point to move this to an OOP approach.
You need a database, which will make this a lot simpler. It will depend on your particular requirements, but I would go with SQLite. RBS "ker_01" wrote in message ... I have a large data set that I need to analyze based on several factors, and I'm thinking this /might/ be the right time to dip my toe into OOP (or as much as VBA allows) instead of my standard linear approach. So far, the most understandable resource I've found is http://www.cpearson.com/excel/Classes.aspx but I'm still having trouble conceptualizing how to set it up (and if the actual analyses I need to perform would benefit from an OOP approach). My source data represents many thousands of purchases, within each purchase there will be multiple products, and within each product there will be multiple quantities, packaging, etc. It is possible that the same product will show up in the same order more than once, with different properties (e.g. milk.size = 1 gallon, and milk.size = 1 pint). So I might have instances of purchases: purchase.date purchase.time purchase.amount purchase.payment_type purchase.membercard_number Then within each purchase, I have instances of items: Item.type 'veggie/dairy/produce/bulk/dry goods/pets/etc Item.brand item.package_size Item.quantity Item.onsale 'boolean First question: In this case, purchase is more than a flat collection, because it has it's own class properties, but also needs to function as a collection of items. What is the proper approach to set that up? Once I have my data loaded from my worksheet into the objects, my analysis will be to look for specific patterns; What was the total amount of vanilla ice cream purchased from Brand_C as compared to vanilla ice cream purchased from Brand_X, excluding purchases where Brand_X was purchased in the same transaction as any items from Brand_Z? I'm thinking I could just create additional properties for each transaction to simplify the analysis, for example, purchase.IncludesBrandXVanillaIceCream_but_noBrand Z 'boolean purchase.TotalDollarsOfBrandXVanillaIceCream Does OOP (or the VBA equivalent) really make these types of analyses easier, or should I just stick with linear data crunching? I appreciate your advice, Keith |
Analyzing data in collections of classes?
Thank you both for your replies.
The data will already be coming to me in Excel, so I'll look at options to push it back into a database. Thanks, Keith "RB Smissaert" wrote: Agree with gimme here. Can't see much point to move this to an OOP approach. You need a database, which will make this a lot simpler. It will depend on your particular requirements, but I would go with SQLite. RBS "ker_01" wrote in message ... I have a large data set that I need to analyze based on several factors, and I'm thinking this /might/ be the right time to dip my toe into OOP (or as much as VBA allows) instead of my standard linear approach. So far, the most understandable resource I've found is http://www.cpearson.com/excel/Classes.aspx but I'm still having trouble conceptualizing how to set it up (and if the actual analyses I need to perform would benefit from an OOP approach). My source data represents many thousands of purchases, within each purchase there will be multiple products, and within each product there will be multiple quantities, packaging, etc. It is possible that the same product will show up in the same order more than once, with different properties (e.g. milk.size = 1 gallon, and milk.size = 1 pint). So I might have instances of purchases: purchase.date purchase.time purchase.amount purchase.payment_type purchase.membercard_number Then within each purchase, I have instances of items: Item.type 'veggie/dairy/produce/bulk/dry goods/pets/etc Item.brand item.package_size Item.quantity Item.onsale 'boolean First question: In this case, purchase is more than a flat collection, because it has it's own class properties, but also needs to function as a collection of items. What is the proper approach to set that up? Once I have my data loaded from my worksheet into the objects, my analysis will be to look for specific patterns; What was the total amount of vanilla ice cream purchased from Brand_C as compared to vanilla ice cream purchased from Brand_X, excluding purchases where Brand_X was purchased in the same transaction as any items from Brand_Z? I'm thinking I could just create additional properties for each transaction to simplify the analysis, for example, purchase.IncludesBrandXVanillaIceCream_but_noBrand Z 'boolean purchase.TotalDollarsOfBrandXVanillaIceCream Does OOP (or the VBA equivalent) really make these types of analyses easier, or should I just stick with linear data crunching? I appreciate your advice, Keith |
All times are GMT +1. The time now is 07:41 PM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com