Group and Count Text Data More Efficiently
Good afternoon. I frequently need to group and count text data, and I'm hoping some clever person out there knows of an easier/more robust/more elegant way to do this. I have one or more columns of "free text" that needs to be collated into a manageable number of groups. For example, the input may have entries such as "Price," "Cost," "Pricing," etc. These should all be grouped into "Price" and counted.

This is what I've been using. I consolidate the column(s) on a separate worksheet, then use Advanced Filter to get the unique values. These go in column A. In column B I have COUNTIF functions to see how many times each original value appears. In column C I type the category it goes into. In column D I have a SUMIF, which sums column B where the value in C equals this category. This function is nested in an IF and MATCH so that each category is only counted once. This is what it looks like:

    A          B    C        D
    Aardvark   1    Mammal   13
    Bear       2    Mammal
    Cod        5    Fish     8
    Dog        10   Mammal
    Eel        3    Fish
    Fly        7    Insect   7

The formulae in column D return null when the category has already been displayed, which is why each category only appears once.

To chart this data, I copy columns C and D, paste their values, and sort descending by D. Alternatively, I'll copy and paste the categories and use VLOOKUPs to find the total for each category; this doesn't have to be updated if the raw data changes, as long as there are no new values.

It works, but is there a better way? In particular, is there some way that would let me sort this block of data in place, so I don't need to copy values to a new location? My boss has a slightly different solution with DSUMs, but it's fragile and, to my thinking, even less elegant. One idea I had would be to add a tiny fraction to the values in D based on the text value in C. This would make each value in D unique, so I could build a separate lookup table using LARGE and MATCH and not have to update it. Does this make sense? Thanks for any opinions!
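(For reference, a minimal sketch of the formulas described above, assuming the raw free text is in column A of a sheet named Raw and the summary table occupies rows 2 to 7 as in the example; the sheet name and ranges are illustrative only:

    B2:  =COUNTIF(Raw!A:A, A2)
    D2:  =IF(MATCH(C2, $C$2:$C$7, 0) = ROW() - 1, SUMIF($C$2:$C$7, C2, $B$2:$B$7), "")

Copy both down. The MATCH test shows the SUMIF total only on the first row where a category appears and returns an empty string on every later row, which reproduces the blanks in column D of the example.)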
Group and Count Text Data More Efficiently
Maybe you could create a lookup table and assign codes...
then have your original table look up the codes, then manipulate that information with your SUMIF or COUNTIF functions. Or you could just add columns for your various categories and use IF statements to populate those columns when there is a match between the cell and the column header.

-- Ted
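(A minimal sketch of the lookup-table idea, assuming the term-to-category mapping sits in Categories!A2:B50 with the raw term in column A and its category in column B; the sheet name and ranges are illustrative only:

    category next to each raw entry:  =VLOOKUP(A2, Categories!$A$2:$B$50, 2, FALSE)
    count for one category:           =COUNTIF($B$2:$B$100, "Mammal")

For the per-category helper columns, with the category names in row 1 of those columns, a formula such as =IF($B2=D$1, 1, 0) copied across and down, plus a SUM at the foot of each column, gives the same totals.)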