Home |
Search |
Today's Posts |
#1
|
|||
|
|||
assigning a unique value
I have a spreadsheet of names, and I need to assign a unique number to each
of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#2
|
|||
|
|||
Dino wrote:
I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter =IF(B2=B1,A1,A1+1) and fill down Alan Beban |
#3
|
|||
|
|||
Alan Beban wrote...
.... If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter =IF(B2=B1,A1,A1+1) and fill down You're assuming all duplicate names are grouped together. If duplicate names wouldn't necessarily be grouped together, use 1 in A1 and A2: =A1+ISERROR(MATCH(B2,B$1:B1,0)) and fill down. |
#4
|
|||
|
|||
IF your name range is in column A beginning from Column A2 type this on
column B2 and copy it all the way down =IF(COUNTIF($A$2:A2,A2)1,VLOOKUP(A2,$A1:B$2,2,0), MAX($B$1:B1)+1) "Dino" wrote: I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#5
|
|||
|
|||
Harlan Grove wrote:
Alan Beban wrote... ... If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter =IF(B2=B1,A1,A1+1) and fill down You're assuming all duplicate names are grouped together. If duplicate names wouldn't necessarily be grouped together, use 1 in A1 and A2: =A1+ISERROR(MATCH(B2,B$1:B1,0)) and fill down. ????! Alan Beban |
#6
|
|||
|
|||
Alan Beban wrote:
Harlan Grove wrote: Alan Beban wrote... ... If your data is in B1:Bwhatever, then enter =1 in A1; then in A2 enter =IF(B2=B1,A1,A1+1) and fill down You're assuming all duplicate names are grouped together. If duplicate names wouldn't necessarily be grouped together, use 1 in A1 and A2: =A1+ISERROR(MATCH(B2,B$1:B1,0)) and fill down. ????! Alan Beban For the case in which the names might not be grouped together, I have a solution with two helper columns that I will post if something more efficient doesn't show up. Alan Beban |
#7
|
|||
|
|||
N Harkawat wrote...
IF your name range is in column A beginning from Column A2 type this on column B2 and copy it all the way down =IF(COUNTIF($A$2:A2,A2)1,VLOOKUP(A2,$A1:B$2,2,0) ,MAX($B$1:B1)+1) .... The issue with COUNTIF is that it'll iterate through all cells in its 1st argument range. MATCH with 0 3rd argument will return on finding the first match. Also, OP's data had names in col B and numbers in col A. In that case, need to use INDEX(.,MATCH()) rather than VLOOKUP. For one cell per result, try these formulas. A1: 1 A2: =1+(B2<B1) A3: =IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1, INDEX(A$1:A2,MATCH(B3,B$1:B2,0))) Fill A3 down as needed. For recalc efficiency, better to enter the MATCH calls and cache the running max col A values in other columns, so 3 cells per result. A1 and A2 cells remain the same as above, but A3 down need 2 ancillary cells (I'll use cols X and Y). A3: =IF(ISERROR(X3),MAX(A$1:A2)+1,INDEX(A$1:A2,X3)) X3: =MATCH(B3,B$1:B2,0) Y3: =MAX(A1:A2) Y4: =Y3+ISERROR(X3) Fill A3 and X3 down as needed, fill Y4 down as needed. |
#8
|
|||
|
|||
And here's an alternative set of formulas:
A1: =VLOOKUP(B1,B$1:D1,3,0) C1: =SUM(1/COUNTIF(B$1:B1,B$1:B1)) array entered D1: 1 D2: =IF(B$2=B$1,IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1, MATCH(B2,B$2:B2,0)),IF(COUNTIF(B$1:B2,B2)=1,MAX(C$ 1:C1)+1,MATCH(B2,B$1:B2,0))) Fill down A1, C1, D2 to the row of the end of the name list. Alan Beban Harlan Grove wrote: N Harkawat wrote... IF your name range is in column A beginning from Column A2 type this on column B2 and copy it all the way down =IF(COUNTIF($A$2:A2,A2)1,VLOOKUP(A2,$A1:B$2,2,0 ),MAX($B$1:B1)+1) ... The issue with COUNTIF is that it'll iterate through all cells in its 1st argument range. MATCH with 0 3rd argument will return on finding the first match. Also, OP's data had names in col B and numbers in col A. In that case, need to use INDEX(.,MATCH()) rather than VLOOKUP. For one cell per result, try these formulas. A1: 1 A2: =1+(B2<B1) A3: =IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1, INDEX(A$1:A2,MATCH(B3,B$1:B2,0))) Fill A3 down as needed. For recalc efficiency, better to enter the MATCH calls and cache the running max col A values in other columns, so 3 cells per result. A1 and A2 cells remain the same as above, but A3 down need 2 ancillary cells (I'll use cols X and Y). A3: =IF(ISERROR(X3),MAX(A$1:A2)+1,INDEX(A$1:A2,X3)) X3: =MATCH(B3,B$1:B2,0) Y3: =MAX(A1:A2) Y4: =Y3+ISERROR(X3) Fill A3 and X3 down as needed, fill Y4 down as needed. |
#9
|
|||
|
|||
"Alan Beban" wrote...
And here's an alternative set of formulas: A1: =VLOOKUP(B1,B$1:D1,3,0) C1: =SUM(1/COUNTIF(B$1:B1,B$1:B1)) array entered Or without array entry, =SUMPRODUCT(1/COUNTIF(B$1:B1,B$1:B1)) D1: 1 D2: =IF(B$2=B$1,IF(COUNTIF(B$1:B2,B2)=1,MAX(C$1:C1)+1 , MATCH(B2,B$2:B2,0)),IF(COUNTIF(B$1:B2,B2)=1,MAX(C $1:C1)+1, MATCH(B2,B$1:B2,0))) Starting with row 2, the function call counts on each row are 1 VLOOKUP O(K) 1 COUNTIF(r,r) O(K^2) 1 SUM[PRODUCT] O(K) 2 COUNTIF(r,x) O(K) both identical 2 MAX O(K) both identical 2 MATCH O(K) 3 IF O(1) 12 in total where K is the row number. Copied down through N rows, the COUNTIF calls in col C dominate, making the overall approach O(N^2 log(N)). Compare the foregoing to the following single cell formula. A3: =IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1, INDEX(A$1:A2,MATCH(B3,B$1:B2,0))) Starting in row 3, the function call counts on each row are 2 MATCH O(K) both identical 1 MAX O(K) 1 INDEX O(1) 1 ISERROR O(1) 1 IF O(1) 6 in total N rows each containing such formulas, so overall O(N log(N)). What's the benefit of your formulas? It's clearly not efficiency, either in terms of recalc speed, disk storage or RAM usage. Then there's the 3 cell/result formulas. Fixing the A3 and Y4 formulas, A3: =IF(ISERROR(X3),Y3,INDEX(A$1:A2,X3)) X3: =MATCH(B3,B$1:B2,0) Y3: =MAX(A1:A2) Y4: =Y3+ISERROR(X4) The Y3 formula is O(1), constant time. Starting with row 4, 1 INDEX O(1) 2 ISERROR O(1) 1 IF O(1) 1 MATCH O(K) 5 in total Over N rows, O(N log(N)) again, but with a constant factor reduction since only one O(K) function call on each row. Again, what's the benefit of your formulas? |
#10
|
|||
|
|||
Thanks for all your input. I have to try these solutions and see what the
results are. "Dino" wrote: I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#11
|
|||
|
|||
Harlan Grove wrote:
. . . What's the benefit of your formulas? It's clearly not efficiency, either in terms of recalc speed, disk storage or RAM usage. I didn't claim any advantage for those formulas; simply that they were an alternative. I started working on them in an effort to provide something that might be useful after that little snippet of silliness that you proffered as a solution(!) for the case of an unsorted list of names in your first posting in this thread. (By the way, some might be wondering about the acknowledgment of your mistakes that you are wont to claim you always step up to.) Since I had done the work, I posted the result. The differences in performance of the several approaches, notwithstanding your esoteric analysis, are probably trivial for many (most?) users in many (most?) circumstances. But my posting the formulas did provide you another opportunity to pontificate; so you ought to be grateful, not snotty. But then you wouldn't be Harlan Grove, would you? Alan Beban |
#12
|
|||
|
|||
Just out of curiosity, are your names grouped or ungrouped? If they are
in fact grouped, then the simplest approach first posted should be considered. Alan Beban Dino wrote: Thanks for all your input. I have to try these solutions and see what the results are. "Dino" wrote: I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#13
|
|||
|
|||
Alan Beban wrote...
.... I didn't claim any advantage for those formulas; simply that they were an alternative. . . . As in '=1+1+1+1+1+1+1' is an alternative way of representing 7? Some alternatives are bad ideas. Sadly, some people are incapable of grasping that simple fact. . . . I started working on them in an effort to provide something that might be useful after that little snippet of silliness that you proffered as a solution(!) for the case of an unsorted list of names in your first posting in this thread. (By the way, some might be wondering about the acknowledgment of your mistakes that you are wont to claim you always step up to.) Since I had done the work, I posted the result. Reread my preceding message to which you responded. Especially, notice the phrase, "Fixing the A3 and Y4 formulas." I'll be more explicit, FIXING *MY* PREVIOUS A3 and Y4 FORMULAS, AND THE FORMULAS IN MY FIRST RESPONSE DIDN'T WORK AT ALL. Happy? The differences in performance of the several approaches, notwithstanding your esoteric analysis, are probably trivial for many (most?) users in many (most?) circumstances. .... In which case there are considerable advantages to the single formula/result approach, namely, A3: =IF(ISERROR(MATCH(B3,B$1:B2,0)),MAX(A$1:A2)+1, INDEX(A$1:A2,MATCH(B3,B$1:B2,0))) As for esoteric, it explains a lot that you're apparently indifferent between O(N) and O(N^2) approaches to solving problems. |
#14
|
|||
|
|||
Alan Beban wrote...
Just out of curiosity, are your names grouped or ungrouped? If they are in fact grouped, then the simplest approach first posted should be considered. Granted. If names are grouped, then Alan's original formula is optimal. However, OP showed unsorted sample data. What's the *reasonable* assumption to make when presented with unsorted data? |
#15
|
|||
|
|||
Harlan Grove wrote:
.... . . . If duplicate names wouldn't necessarily be grouped together, use 1 in A1 and A2: =A1+ISERROR(MATCH(B2,B$1:B1,0)) Full disclosu this is FUBAR. |
#16
|
|||
|
|||
Harlan Grove wrote:
Alan Beban wrote... Just out of curiosity, are your names grouped or ungrouped? If they are in fact grouped, then the simplest approach first posted should be considered. Granted. If names are grouped, then Alan's original formula is optimal. However, OP showed unsorted sample data. No; the OP showed grouped, unsorted sample data. What's the *reasonable* assumption to make when presented with unsorted data? Oh please! Why are you driven to making a contest out of the dumbest issues? The question is whether the OP showed grouped, unsorted data because his real data is grouped and unsorted, or because he didn't realize that his sample data didn't represent his real data. Either assumption seems reasonable to me; and in the absence of any knowledge about the OP, other than that he went to the trouble of grouping the data, the first seems a bit more reasonable than the second. Might not be the correct assumption, but certainly reasonable. Alan Beban |
#17
|
|||
|
|||
Harlan Grove wrote:
. . . Reread my preceding message to which you responded. Especially, notice the phrase, "Fixing the A3 and Y4 formulas." I'll be more explicit, FIXING *MY* PREVIOUS A3 and Y4 FORMULAS, AND THE FORMULAS IN MY FIRST RESPONSE DIDN'T WORK AT ALL. Happy? . . . I don't care a fig either way; you're the one who's made a big point of acknowledging your mistakes. Alan Beban |
#18
|
|||
|
|||
Alan Beban wrote...
Harlan Grove wrote: .... What's the *reasonable* assumption to make when presented with unsorted data? .... . . . The question is whether the OP showed grouped, unsorted data because his real data is grouped and unsorted, or because he didn't realize that his sample data didn't represent his real data. . . . .... In my experience grouped but unsorted data is exceedingly rare because far & away the easiest way to group data is to sort it. Also, and you may come to realize this as you gain experience in these newsgroups, OPs seldom provide sample data that's truly representative of their real data. . . . other than that he went to the trouble of grouping the data, . .. . .... Unlikely the OP went to any trouble doing so. More likely he did it to make the repeating codes in the first column more obvious. |
#19
|
|||
|
|||
Harlan Grove wrote:
. . . OPs seldom provide sample data that's truly representative of their real data. . . . Pure, unadulterated, characteristically arrogant and self-serving BS. Alan Beban |
#20
|
|||
|
|||
"Alan Beban" wrote...
.... Pure, unadulterated, characteristically arrogant and self-serving BS. As opposed to your characteristic militant ignorance. Take a peek at the number of OP follow-ups in which they need to provide additional details. For example, http://groups-beta.google.com/group/...5?dmode=source (or http://makeashorterlink.com/?C6952670B ). |
#21
|
|||
|
|||
Harlan Grove wrote:
"Alan Beban" wrote... ... Pure, unadulterated, characteristically arrogant and self-serving BS. As opposed to your characteristic militant ignorance. Take a peek at the number of OP follow-ups in which they need to provide additional details. For example, http://groups-beta.google.com/group/...5?dmode=source (or http://makeashorterlink.com/?C6952670B ). Did I miss something, or did you cite the existence of one single post to support your statement "OPs *seldom* provide sample data that's truly representative of their real data" [Emphasis added]? Alan Beban |
#22
|
|||
|
|||
"Alan Beban" wrote...
.... (or http://makeashorterlink.com/?C6952670B ). Did I miss something, or did you cite the existence of one single post to support your statement "OPs *seldom* provide sample data that's truly representative of their real data" [Emphasis added]? I thought you'd be too stupid to understand the common meaning of "i'm afraid i didn't give enough info on my first post." and too stupid or lazy to compare the info in the first post and the follow-up. Having a difficult time understanding English? Can't follow the dots? |
#23
|
|||
|
|||
One problem is that I have to re-sort according to different criteria
constantly. I have many other columns in the spreadsheet that I have to sort by. So the solution would have to be one that would not be affected by the names being ungrouped. "Alan Beban" wrote: Just out of curiosity, are your names grouped or ungrouped? If they are in fact grouped, then the simplest approach first posted should be considered. Alan Beban Dino wrote: Thanks for all your input. I have to try these solutions and see what the results are. "Dino" wrote: I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#24
|
|||
|
|||
Dino wrote...
One problem is that I have to re-sort according to different criteria constantly. I have many other columns in the spreadsheet that I have to sort by. So the solution would have to be one that would not be affected by the names being ungrouped. .... Another example of specs becoming more complete as the thread progresses. Is there any chance Alan Beban will ever learn this lesson? I'm not hopeful. |
#25
|
|||
|
|||
Dino wrote...
One problem is that I have to re-sort according to different criteria constantly. I have many other columns in the spreadsheet that I have to sort by. So the solution would have to be one that would not be affected by the names being ungrouped. If you'd be resorting constantly, would you be expecting the numbers corresponding to the names to vary after each sort? If not, sort by name and use Alan Beban's original formula, then copy the range of numbers and paste special as values onto itself to 'freeze' those numbers. If you need the numbers constantly changing, see the formulas in my later posts in this thread. |
#26
|
|||
|
|||
Thanks all for your input. The formula from the post by N. Harkawat worked! I
did not try the formulas after that post, although I'm sure all have merit. I can now sort differently, and it doesn't affect the result. "Dino" wrote: Thanks for all your input. I have to try these solutions and see what the results are. "Dino" wrote: I have a spreadsheet of names, and I need to assign a unique number to each of the names. Some entries are repeated, so each duplicate should have the same number. The numbers assigned would be in a different column than the names. For example, the result I need would look like this: 1 John Smith 1 John Smith 2 Carla Jones 3 Jenny Fortuna 4 James Wood 4 James Wood and so on. Is there a formula that I can use to accomplish this, so I don't have to go through the entire spreadsheet and type these numbers in? If anyone can help, thanks. |
#27
|
|||
|
|||
Harlan Grove wrote:
"Alan Beban" wrote... ... (or http://makeashorterlink.com/?C6952670B ). Did I miss something, or did you cite the existence of one single post to support your statement "OPs *seldom* provide sample data that's truly representative of their real data" [Emphasis added]? I thought you'd be too stupid to understand the common meaning of "i'm afraid i didn't give enough info on my first post." and too stupid or lazy to compare the info in the first post and the follow-up. Having a difficult time understanding English? Can't follow the dots? Ah yes. When the going gets tough, start name-calling and try to divert the readers from the point. I'm having trouble even "finding" the dots. Let me ask it more clearly so you can stop pretending to have missed the point: Are you suggestng that the existence of a single thread, in which one particular OP failed to provide representative sample data, proves your hyperbolic assertion that OPs "seldom" provide respresentative sample data? So by that "logic", if one wanted to prove that OPs "usually" provide representative sample data he/she could do that by pointing to a single thread in which the OP did provide representative sample data? Alan Beban |
#28
|
|||
|
|||
Alan Beban wrote...
.... I'm having trouble even "finding" the dots. Let me ask it more clearly so you can stop pretending to have missed the point: Are you suggestng that the existence of a single thread, in which one particular OP failed to provide representative sample data, proves your hyperbolic assertion that OPs "seldom" provide respresentative sample data? So by that "logic", if one wanted to prove that OPs "usually" provide representative sample data he/she could do that by pointing to a single thread in which the OP did provide representative sample data? It was incomplete sample data, lacking mention of the 4th field in the OP. Didn't catch that, did you? There's a lot that seems to get past you most days. More threads. You may have to read a few messages in each of them. http://groups-beta.google.com/group/...a600be18c43a47 http://groups-beta.google.com/group/...2146ccde98b682 |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
How do I return the unique entries from a column to a listbox | Excel Worksheet Functions | |||
find rows for unique data in 1 column and different data in other. | Excel Discussion (Misc queries) | |||
How do I count or display unique data in a column? | Excel Worksheet Functions | |||
Count of unique items meeting condition | Excel Worksheet Functions | |||
UNIQUE | Excel Worksheet Functions |