Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
Hello everyone,
I have a large set of data that has in one column, the company id which repeats on consecutive rows for each year in the dataset. Then in the next column, there's the ID of each segment within the company (a number). I need to count for each company code, how many segments (unique numbers) are there. The data for a small sample looks like this: COMP-ID SEG-ID 1001 1 1001 1 1001 1 1003 1 1003 1 1003 1 1003 1 1004 3 1004 4 1004 5 1004 3 1004 4 1004 5 1004 3 in this case I would need to generate a list like this 1001 1 1003 1 1004 3 which means that company 1001 has 1 segment, 1003 has 1 segment and company 1004 has 3 segments (segments 3, 4 and 5). I hope anybody can shed some light on how to tackle this one because I haven't been able to figure something out. Thanks in advance, Juan Marin |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
Try this array formula** :
Data in the range A2:B15 Assuming no empty cells in column B D2:Dn = 1001, 1003, 1004 etc Array entered** in E2 and copied down as needed: =SUM(IF(FREQUENCY(IF(A$2:A$15=D2,B$2:B$15),B$2:B$1 5),1)) ** array formulas need to be entered using the key combination of CTRL,SHIFT,ENTER (not just ENTER). Hold down both the CTRL key and the SHIFT key then hit ENTER. -- Biff Microsoft Excel MVP "Juan Marin" wrote in message ... Hello everyone, I have a large set of data that has in one column, the company id which repeats on consecutive rows for each year in the dataset. Then in the next column, there's the ID of each segment within the company (a number). I need to count for each company code, how many segments (unique numbers) are there. The data for a small sample looks like this: COMP-ID SEG-ID 1001 1 1001 1 1001 1 1003 1 1003 1 1003 1 1003 1 1004 3 1004 4 1004 5 1004 3 1004 4 1004 5 1004 3 in this case I would need to generate a list like this 1001 1 1003 1 1004 3 which means that company 1001 has 1 segment, 1003 has 1 segment and company 1004 has 3 segments (segments 3, 4 and 5). I hope anybody can shed some light on how to tackle this one because I haven't been able to figure something out. Thanks in advance, Juan Marin |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
On Mon, 10 Aug 2009 21:07:01 -0700 (PDT), Juan Marin
wrote: Hello everyone, I have a large set of data that has in one column, the company id which repeats on consecutive rows for each year in the dataset. Then in the next column, there's the ID of each segment within the company (a number). I need to count for each company code, how many segments (unique numbers) are there. The data for a small sample looks like this: COMP-ID SEG-ID 1001 1 1001 1 1001 1 1003 1 1003 1 1003 1 1003 1 1004 3 1004 4 1004 5 1004 3 1004 4 1004 5 1004 3 in this case I would need to generate a list like this 1001 1 1003 1 1004 3 which means that company 1001 has 1 segment, 1003 has 1 segment and company 1004 has 3 segments (segments 3, 4 and 5). I hope anybody can shed some light on how to tackle this one because I haven't been able to figure something out. Thanks in advance, Juan Marin Add a column to your data. If your has the column labels in A1:B1, and the data starts in A2, then C1: Unique Segs C2: =IF(SUMPRODUCT(($A$2:$A2=A2)*($B$2:$B2=B2))1,0,1) Fill down as far as needed. Then Insert/Pivot Table. COMP-ID to Row area Unique Segs to Values or Data area Make sure your "Unique Segs" value field setting is set to do a SUM --ron |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
Thank you so much Ron and Biff, both solutions work pretty well.
Nevertheless I'm encountering one more difficulty, my data set is really large (+400,000 rows) so pasting down any of these formulas is not practical because as I paste them down, they stay fixed to the first cell of the range. However, I know that the data is ordered, and that in any case, if I were able to look just 100 rows forward and 100 rows ahead, that would be enough. I've been trying to fix both formulas with ADDRESS() in such a way that I could vary the initial and ending rows in such way, but I haven't been able to do so effectively. Any thoughts? Thank you again. JM |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
my data set is really large (+400,000 rows)
Yeah, that could slow things down when you're counting uniques on that size range! I know that the data is ordered Despite the length of this formula it's much better! It narrows down the range to only those cells that are of the same ID. Array entered** =SUM(IF(FREQUENCY(IF(INDEX(A:A,MATCH(D2,A:A,0)):IN DEX(A:A,MATCH(D2,A:A,0)+COUNTIF(A:A,D2)-1)=D2,INDEX(B:B,MATCH(D2,A:A,0)):INDEX(B:B,MATCH(D 2,A:A,0)+COUNTIF(A:A,D2)-1)),INDEX(B:B,MATCH(D2,A:A,0)):INDEX(B:B,MATCH(D2, A:A,0)+COUNTIF(A:A,D2)-1)),1)) If you can download and install the free Morefunc add-in from: http://xcell05.free.fr/morefunc/english/index.htm Alternate download site: http://www.download.com/Morefunc/300...-10423159.html Then you can use this slightly shorter array entered** version: =SUM(IF(FREQUENCY(IF(INDEX(A:A,MATCH(D2,A:A,0)):IN DEX(A:A,MATCH(D2,A:A,0)+COUNTIF(A:A,D2)-1)=D2,SETV(INDEX(B:B,MATCH(D2,A:A,0)):INDEX(B:B,MA TCH(D2,A:A,0)+COUNTIF(A:A,D2)-1))),GETV()),1))** array formulas need to be entered using the key combination ofCTRL,SHIFT,ENTER (not just ENTER). Hold down both the CTRL key and the SHIFTkey then hit ENTER.--BiffMicrosoft Excel MVP"Juan Marin" wrote in ... Thank you so much Ron and Biff, both solutions work pretty well. Nevertheless I'm encountering one more difficulty, my data set is really large (+400,000 rows) so pasting down any of these formulas is not practical because as I paste them down, they stay fixed to the first cell of the range. However, I know that the data is ordered, and that in any case, if I were able to look just 100 rows forward and 100 rows ahead, that would be enough. I've been trying to fix both formulas with ADDRESS() in such a way that I could vary the initial and ending rows in such way, but I haven't been able to do so effectively. Any thoughts? Thank you again. JM |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
Hello Juan,
I would not use worksheet functions to calculate this. See here, please: http://sulprobil.com/html/count_unique.html I would suggest to use my UDF Pfreq: http://sulprobil.com/html/pfreq.html Use the =Pfreq(Pfreq(A1:A400000,B1:B400000)) approach if its working. Try my UDF Pstat if your data amount is too large for Pfreq. http://sulprobil.com/html/pstat.html Regards, Bernd |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
With this amount of data, perhaps you could use the
remove duplicates Excel 07 command or advanced filter for unique values and then use a PivotTable or formula? Otherwise try the Import Data command (under Data Import External Data in XL02 menus) Select Excel files, locate the current file and click Open. Choose the Data range eg Sheet1 (or a named range) then OK. Now Choose Edit Query, Command Type: SQL and Command Text: SELECT [COMP-ID],COUNT(*) AS [Segs] FROM (SELECT DISTINCT * FROM [Sheet1$]) GROUP BY [COMP-ID] Click OK, select the output cell and click OK again. For 65536 rows and around 10000 COMP-IDs this takes only a second to run in tests, other solutions froze the application. "Juan Marin" wrote: Hello everyone, I have a large set of data that has in one column, the company id which repeats on consecutive rows for each year in the dataset. Then in the next column, there's the ID of each segment within the company (a number). I need to count for each company code, how many segments (unique numbers) are there. The data for a small sample looks like this: COMP-ID SEG-ID 1001 1 1001 1 1001 1 1003 1 1003 1 1003 1 1003 1 1004 3 1004 4 1004 5 1004 3 1004 4 1004 5 1004 3 in this case I would need to generate a list like this 1001 1 1003 1 1004 3 which means that company 1001 has 1 segment, 1003 has 1 segment and company 1004 has 3 segments (segments 3, 4 and 5). I hope anybody can shed some light on how to tackle this one because I haven't been able to figure something out. Thanks in advance, Juan Marin |
#8
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Counting unique numbers
Hello Lori,
400,000 records are admittedly asking for a database approach. I optimized the memory strategy of my UDF Pfreq and its working now for 65,536 rows with 10,000 different items. Runtime on my dualcore proc: about 6 seconds. Regards, Bernd |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
counting and unique | Excel Discussion (Misc queries) | |||
Counting Unique | Excel Worksheet Functions | |||
Unique counting | Excel Discussion (Misc queries) | |||
Counting Unique Part Numbers In A Range | Excel Worksheet Functions | |||
counting unique numbers in filtered data | Excel Worksheet Functions |