Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
Before posting this message I read through all the posts for finding and
counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
Correct me if I'm wrong here, but it appears that you are not interested in
finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
Yes, I am interested in only finding the entries that are unique. But its not
always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
You say the second set of characters can have different lengths. What about
the first set, the set of 17 characters in your example? Also, you say you want the unique entries "moved". Not copied but moved? Otto "K. Gwynn" wrote in message ... Yes, I am interested in only finding the entries that are unique. But its not always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
okay: the entries in my data set can have anywhere between 17 and 27
characters. The entries are broken into character strings by the underscore character. Two groups of character strings are important: those that identify the entry as belonging to a particular subgroup and those that then determine whether within that subgroup that entry is unique. Each subgroup where we determine if one entry is different from another is identified by the characters in positions 7 through 12 or 7 through 14. The subgroup that determines whether an entry is unique within that subgroup is either the last character or starts 8 to 9 characters from the right. Here's another example: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P (4) CMMI2_PPQA_GP2_01_P_PD_PP_H In this example there are three subgroups: line 1; line 2; line 3 & 4 each represent unique subgroups. Because lines 1 & 2 only have one entry in their respective subgroups, they are both unique. The last 8 characters of line 4 (and really line 1) are always superfluous. Therefore, to determine if entries 3 & 4 are duplicates we move past those characters and look at the last character in line 3 and what would be the last character in line 4 if the group of 8 €śextra€ť characters were removed from line 4. When viewed in that way, we see that these entries are duplicates and we only need to keep one entry (doesnt matter which). We now have three unique entries to move to our new column: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P I want to copy the unique set to a new column, not move. I need to keep the original set of data. "Otto Moehrbach" wrote: You say the second set of characters can have different lengths. What about the first set, the set of 17 characters in your example? Also, you say you want the unique entries "moved". Not copied but moved? Otto "K. Gwynn" wrote in message ... Yes, I am interested in only finding the entries that are unique. But its not always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
You can see for yourself that the variation in what number character means
what makes your explanation too difficult to follow for anyone but you. Do this. Copy the whole column of these entries into Column A of a new sheet starting with A1. Then select all the cells that have entries in this new sheet. Now click on Data - Text to columns. Select delimited. Click the "Other" box and insert the underscore character. Select Text. Click on OK. This puts each piece of each entry into a separate column. Now explain how to determine if a row is unique. Don't get into groups and subgroups. Stick with columns, Column A means this, Column B means that, and so forth. I know that what you explained makes sense to you, but you have to realize that you are in the minority on this. What you want to do is very easy for Excel to do, if I knew what you wanted. Ignore the fact that now all of your entries are distributed throughout a bunch of columns, one piece per column. Excel can put this egg back together again very easily. HTH Otto "K. Gwynn" wrote in message ... okay: the entries in my data set can have anywhere between 17 and 27 characters. The entries are broken into character strings by the underscore character. Two groups of character strings are important: those that identify the entry as belonging to a particular subgroup and those that then determine whether within that subgroup that entry is unique. Each subgroup where we determine if one entry is different from another is identified by the characters in positions 7 through 12 or 7 through 14. The subgroup that determines whether an entry is unique within that subgroup is either the last character or starts 8 to 9 characters from the right. Here's another example: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P (4) CMMI2_PPQA_GP2_01_P_PD_PP_H In this example there are three subgroups: line 1; line 2; line 3 & 4 each represent unique subgroups. Because lines 1 & 2 only have one entry in their respective subgroups, they are both unique. The last 8 characters of line 4 (and really line 1) are always superfluous. Therefore, to determine if entries 3 & 4 are duplicates we move past those characters and look at the last character in line 3 and what would be the last character in line 4 if the group of 8 "extra" characters were removed from line 4. When viewed in that way, we see that these entries are duplicates and we only need to keep one entry (doesn't matter which). We now have three unique entries to move to our new column: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P I want to copy the unique set to a new column, not move. I need to keep the original set of data. "Otto Moehrbach" wrote: You say the second set of characters can have different lengths. What about the first set, the set of 17 characters in your example? Also, you say you want the unique entries "moved". Not copied but moved? Otto "K. Gwynn" wrote in message ... Yes, I am interested in only finding the entries that are unique. But its not always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
Actually, I've done that too! I guess I then need to put back into a single
column the areas that form the "uniqueness"? I was trying to avoid all those extra steps, but I guess there isn't any way to do that. "Otto Moehrbach" wrote: You can see for yourself that the variation in what number character means what makes your explanation too difficult to follow for anyone but you. Do this. Copy the whole column of these entries into Column A of a new sheet starting with A1. Then select all the cells that have entries in this new sheet. Now click on Data - Text to columns. Select delimited. Click the "Other" box and insert the underscore character. Select Text. Click on OK. This puts each piece of each entry into a separate column. Now explain how to determine if a row is unique. Don't get into groups and subgroups. Stick with columns, Column A means this, Column B means that, and so forth. I know that what you explained makes sense to you, but you have to realize that you are in the minority on this. What you want to do is very easy for Excel to do, if I knew what you wanted. Ignore the fact that now all of your entries are distributed throughout a bunch of columns, one piece per column. Excel can put this egg back together again very easily. HTH Otto "K. Gwynn" wrote in message ... okay: the entries in my data set can have anywhere between 17 and 27 characters. The entries are broken into character strings by the underscore character. Two groups of character strings are important: those that identify the entry as belonging to a particular subgroup and those that then determine whether within that subgroup that entry is unique. Each subgroup where we determine if one entry is different from another is identified by the characters in positions 7 through 12 or 7 through 14. The subgroup that determines whether an entry is unique within that subgroup is either the last character or starts 8 to 9 characters from the right. Here's another example: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P (4) CMMI2_PPQA_GP2_01_P_PD_PP_H In this example there are three subgroups: line 1; line 2; line 3 & 4 each represent unique subgroups. Because lines 1 & 2 only have one entry in their respective subgroups, they are both unique. The last 8 characters of line 4 (and really line 1) are always superfluous. Therefore, to determine if entries 3 & 4 are duplicates we move past those characters and look at the last character in line 3 and what would be the last character in line 4 if the group of 8 "extra" characters were removed from line 4. When viewed in that way, we see that these entries are duplicates and we only need to keep one entry (doesn't matter which). We now have three unique entries to move to our new column: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P I want to copy the unique set to a new column, not move. I need to keep the original set of data. "Otto Moehrbach" wrote: You say the second set of characters can have different lengths. What about the first set, the set of 17 characters in your example? Also, you say you want the unique entries "moved". Not copied but moved? Otto "K. Gwynn" wrote in message ... Yes, I am interested in only finding the entries that are unique. But its not always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
#8
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
tagging unique items in a list
Thanks for pushing me to think! I got a solution -- it may not have been
exactly what I had wanted (its more steps than I wanted) but I think its going to work for us. Thanks again. "Otto Moehrbach" wrote: You can see for yourself that the variation in what number character means what makes your explanation too difficult to follow for anyone but you. Do this. Copy the whole column of these entries into Column A of a new sheet starting with A1. Then select all the cells that have entries in this new sheet. Now click on Data - Text to columns. Select delimited. Click the "Other" box and insert the underscore character. Select Text. Click on OK. This puts each piece of each entry into a separate column. Now explain how to determine if a row is unique. Don't get into groups and subgroups. Stick with columns, Column A means this, Column B means that, and so forth. I know that what you explained makes sense to you, but you have to realize that you are in the minority on this. What you want to do is very easy for Excel to do, if I knew what you wanted. Ignore the fact that now all of your entries are distributed throughout a bunch of columns, one piece per column. Excel can put this egg back together again very easily. HTH Otto "K. Gwynn" wrote in message ... okay: the entries in my data set can have anywhere between 17 and 27 characters. The entries are broken into character strings by the underscore character. Two groups of character strings are important: those that identify the entry as belonging to a particular subgroup and those that then determine whether within that subgroup that entry is unique. Each subgroup where we determine if one entry is different from another is identified by the characters in positions 7 through 12 or 7 through 14. The subgroup that determines whether an entry is unique within that subgroup is either the last character or starts 8 to 9 characters from the right. Here's another example: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P (4) CMMI2_PPQA_GP2_01_P_PD_PP_H In this example there are three subgroups: line 1; line 2; line 3 & 4 each represent unique subgroups. Because lines 1 & 2 only have one entry in their respective subgroups, they are both unique. The last 8 characters of line 4 (and really line 1) are always superfluous. Therefore, to determine if entries 3 & 4 are duplicates we move past those characters and look at the last character in line 3 and what would be the last character in line 4 if the group of 8 "extra" characters were removed from line 4. When viewed in that way, we see that these entries are duplicates and we only need to keep one entry (doesn't matter which). We now have three unique entries to move to our new column: (1) CMMI2_MA_SP2_04_P_ND_PP_H (2) CMMI2_PMC_GP2_01_G (3) CMMI2_PPQA_GP2_01_P I want to copy the unique set to a new column, not move. I need to keep the original set of data. "Otto Moehrbach" wrote: You say the second set of characters can have different lengths. What about the first set, the set of 17 characters in your example? Also, you say you want the unique entries "moved". Not copied but moved? Otto "K. Gwynn" wrote in message ... Yes, I am interested in only finding the entries that are unique. But its not always the same number of characters. The second set of characters is sometimes as many as four characters long. The stuff at the end could be as short as 3 characters or as long as 5 extra characters. Once identified, I need to have the unique entries moved to another column so I can work just with them. "Otto Moehrbach" wrote: Correct me if I'm wrong here, but it appears that you are not interested in finding duplicate entries, but only those entries that are duplicate for the first specific number of characters. In your examples that's 17 characters. In other words, not counting what you refer to as the extension. Is that right? You would need VBA for that but you don't say what you want done with entries that are found to be duplicates (for the first 17 characters). Do you want duplicates highlighted? Moved? Copied? All the entries that are duplicates or do you want to leave one? HTH Otto "K. Gwynn" wrote in message ... Before posting this message I read through all the posts for finding and counting unique entries and tried most of the responses and still have not found a solution that works for my situation. So here goes: I have a spreadsheet with 700+ entries. The records I need to sort on look like this: CMMI2_CM_GP2_08_G CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H CMMI2_CM_GP2_10_P_ND_PP_H In the example above the duplicates are CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P CMMI2_CM_GP2_09_P_PD_PP_H So when I've tried using Advanced Filter, the two that look exactly the same are filtered out, but the one with the "PD_PP_H" extension is not. [No, that extension cannot be filtered out ahead of time and yes, it is important to the data set]. Now what? Thanks in advance for any suggestions and help. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Function to count unique items in list | Excel Worksheet Functions | |||
Pivot Table dropping items from page item list | Excel Discussion (Misc queries) | |||
Best way to get a list of unique entries in a field | Excel Worksheet Functions | |||
named range, data validation: list non-selected items, and new added items | Excel Discussion (Misc queries) | |||
unique occurences in list | Excel Worksheet Functions |