Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
Here's an approach pulls the unique records onto another sheet:
Assumptions: Sheet1 contains your data in cells A1:C1000 Sheet2 is where you want the extracted unique data to be displayed Using Sheet2: A1: LastName B1: FirstName C1: Address InsertNameDefine Names in workbook: Sheet2!Extract Refers to: =Sheet2!$A$1:$C$1 Still using Sheet2: InsertNameDefine Names in workbook: Sheet2!Database Refers to: =Sheet1!$A$1:$C$1000 (Notice: you are on Sheet2, and creating a Sheet2-level range name, but the referenced range is on Sheet1) The reason: An advanced filter cannot SEND data to another sheet, but it can PULL data FROM another sheet. Now...set up the Advanced Data Filter: <Data<Filter<Advanced Filter Select: Copy to another location List Range: (press F3 and select Database) Criteria Range: (none for this exercise) Copy To: (press F3 and select Extract) Check: Unique records only Click [OK] Note: if you want to run that Advanced Data Filter again, you'll need to re-select the range "Database" each time Is that something you can work with? *********** Regards, Ron XL2002, WinXP "Northwoods" wrote: I have a client history that includes similar (Name, address) and dissimilar (date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
Just another option to play with ..
Assume source data is in a sheet named: X within cols A to G, data from row2 down to row 22000 where cols A to C are the key cols: LastName, FirstName & Address In a new sheet named: U (for "Uniques", say) Paste the 6 col headers from X's A1:F1 into B1:G1 Put in A2: =IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),"")) Copy A2 down to cover the max expected extent of source data, say down to A22200. (Leave A1 blank) Then put in B2: =IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0))) Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient to cover the max expected number of unique lines from X, say down to G5000. When the entire uniques extract is exhausted you'd see only "blank" rows. Cols B to G will extract only the unique* lines from X, with all results neatly bunched at the top. Hide away col A if desired. *ie lines in X with unique: LastName - FirstName - Address Then dress up U nicely, make a copy of it, rename the copy as say: D (for "Duplicates"). In sheet: D, Replace the formula in A2 by: =IF(COUNTBLANK(X!A2:C2)=3,"",IF(U!A2="",ROW(),"")) then copy A2 down to A22200 as before. (Leave A1 blank) Select B2:G2 (same formulas - no change), fill down by the smallest extent sufficient to cover the max expected "balance" duplicate lines from X, say down to G17200 (assuming 5,000 max unique lines, balance duplicates expected is 22200 - 5000 = 17200 roughly). When the entire duplicates extract is exhausted you'd see only "blank" rows, as in the uniques sheet U. Cols B to G will extract only the duplicate lines from X, with all results neatly bunched at the top. Hide away col A if desired. Easiest way to try out the above is to simply rename a copy of your source sheet as: X, ensure the data structure is as assumed, then plug-in the formulas "as-is", and power away. You can always restore the source sheet's name later and leave it to Excel to auto-change the formulas to suit. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- "Northwoods" wrote: I have a client history that includes similar (Name, address) and dissimilar (date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
Thanks everyone,
I am going to get to work and post the results. Northwoods "Max" wrote: Just another option to play with .. Assume source data is in a sheet named: X within cols A to G, data from row2 down to row 22000 where cols A to C are the key cols: LastName, FirstName & Address In a new sheet named: U (for "Uniques", say) Paste the 6 col headers from X's A1:F1 into B1:G1 Put in A2: =IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),"")) Copy A2 down to cover the max expected extent of source data, say down to A22200. (Leave A1 blank) Then put in B2: =IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0))) Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient to cover the max expected number of unique lines from X, say down to G5000. When the entire uniques extract is exhausted you'd see only "blank" rows. Cols B to G will extract only the unique* lines from X, with all results neatly bunched at the top. Hide away col A if desired. *ie lines in X with unique: LastName - FirstName - Address Then dress up U nicely, make a copy of it, rename the copy as say: D (for "Duplicates"). In sheet: D, Replace the formula in A2 by: =IF(COUNTBLANK(X!A2:C2)=3,"",IF(U!A2="",ROW(),"")) then copy A2 down to A22200 as before. (Leave A1 blank) Select B2:G2 (same formulas - no change), fill down by the smallest extent sufficient to cover the max expected "balance" duplicate lines from X, say down to G17200 (assuming 5,000 max unique lines, balance duplicates expected is 22200 - 5000 = 17200 roughly). When the entire duplicates extract is exhausted you'd see only "blank" rows, as in the uniques sheet U. Cols B to G will extract only the duplicate lines from X, with all results neatly bunched at the top. Hide away col A if desired. Easiest way to try out the above is to simply rename a copy of your source sheet as: X, ensure the data structure is as assumed, then plug-in the formulas "as-is", and power away. You can always restore the source sheet's name later and leave it to Excel to auto-change the formulas to suit. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- "Northwoods" wrote: I have a client history that includes similar (Name, address) and dissimilar (date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
Thanks for posting back !
Let us know how it went for you -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- "Northwoods" wrote in message ... Thanks everyone, I am going to get to work and post the results. Northwoods |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
Hi Ron, your solution worked great but I miss the necessary 3 extra columns.
They can't be used to determine unique records because it is dissimular data. However, when all is said and done, I still need those 3 columns included in the Unique list. Max, I need to know a quick way to copy the formula down to the 22,000 row. I was dragging down and then I thought hey there must be a better way. Any tips? Thanks fands "Ron Coderre" wrote: Here's an approach pulls the unique records onto another sheet: Assumptions: Sheet1 contains your data in cells A1:C1000 Sheet2 is where you want the extracted unique data to be displayed Using Sheet2: A1: LastName B1: FirstName C1: Address InsertNameDefine Names in workbook: Sheet2!Extract Refers to: =Sheet2!$A$1:$C$1 Still using Sheet2: InsertNameDefine Names in workbook: Sheet2!Database Refers to: =Sheet1!$A$1:$C$1000 (Notice: you are on Sheet2, and creating a Sheet2-level range name, but the referenced range is on Sheet1) The reason: An advanced filter cannot SEND data to another sheet, but it can PULL data FROM another sheet. Now...set up the Advanced Data Filter: <Data<Filter<Advanced Filter Select: Copy to another location List Range: (press F3 and select Database) Criteria Range: (none for this exercise) Copy To: (press F3 and select Extract) Check: Unique records only Click [OK] Note: if you want to run that Advanced Data Filter again, you'll need to re-select the range "Database" each time Is that something you can work with? *********** Regards, Ron XL2002, WinXP "Northwoods" wrote: I have a client history that includes similar (Name, address) and dissimilar (date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
I need a little more information....
A client has 5 records, all with identical LastName, FirstName, and Address values. I'm guessing that the other fields contain other information, different for each record. Of those 5 records, what deteremines which record contains the data in the other fields that you want associated with the "unique" record? *********** Regards, Ron XL2002, WinXP "Northwoods" wrote: Hi Ron, your solution worked great but I miss the necessary 3 extra columns. They can't be used to determine unique records because it is dissimular data. However, when all is said and done, I still need those 3 columns included in the Unique list. Max, I need to know a quick way to copy the formula down to the 22,000 row. I was dragging down and then I thought hey there must be a better way. Any tips? Thanks fands "Ron Coderre" wrote: Here's an approach pulls the unique records onto another sheet: Assumptions: Sheet1 contains your data in cells A1:C1000 Sheet2 is where you want the extracted unique data to be displayed Using Sheet2: A1: LastName B1: FirstName C1: Address InsertNameDefine Names in workbook: Sheet2!Extract Refers to: =Sheet2!$A$1:$C$1 Still using Sheet2: InsertNameDefine Names in workbook: Sheet2!Database Refers to: =Sheet1!$A$1:$C$1000 (Notice: you are on Sheet2, and creating a Sheet2-level range name, but the referenced range is on Sheet1) The reason: An advanced filter cannot SEND data to another sheet, but it can PULL data FROM another sheet. Now...set up the Advanced Data Filter: <Data<Filter<Advanced Filter Select: Copy to another location List Range: (press F3 and select Database) Criteria Range: (none for this exercise) Copy To: (press F3 and select Extract) Check: Unique records only Click [OK] Note: if you want to run that Advanced Data Filter again, you'll need to re-select the range "Database" each time Is that something you can work with? *********** Regards, Ron XL2002, WinXP "Northwoods" wrote: I have a client history that includes similar (Name, address) and dissimilar (date) information. I want to extract duplicate records to another worksheet and leave only 1 record per client on the other. Problem is querying only the first 3 columns, LastName, FirstName & Address and ignoring the remaining 3 columns that have dissimilar information and the system would consider unique. There are 22,000 records. Any ideas most appreciated |
#8
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract duplicate rows to another worksheet
"Northwoods" wrote
.. Max, I need to know a quick way to copy the formula down to the 22,000 row. I was dragging down and then I thought hey there must be a better way. Any tips? 99% of the time, I'd just simply copy the formula/s by dragging down. Anyway, here's an alternative to dragging down. Since we're dealing with large ranges and calc intensive formulas here, as a precaution, I'd set the book's calc mode to "Manual" first, via clicking Tools Options Calculation tab Check "Manual" OK. Eg for: Put in A2: =IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),"")) Copy A2 down to cover the max expected extent of source data, say down to A22200. Type the required range in the namebox, viz.: A2:A22200 Press ENTER. This will select the range with A2 active Then with range selected, copy & paste the formula directly into the formula bar (this will be for A2, the active cell) and press CTRL+ENTER (instead of just pressing ENTER). This will fill the formulas for the entire range A2:A22200 As for the other fill down for a multi-col range .. Then put in B2: =IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0))) Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient to cover the max expected number of unique lines from X, say down to G5000. After copying B2 to G2, just right-click to copy the range B2:G2 Then type the reqd range in the name box, ie: B3:G5000 press ENTER (B3 will be active). Then right-click on the selected range, choose Paste special check "Formulas" OK. And after filling all of the above / whenever required, we should remember to press F9 to calculate. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Too Many Rows For Worksheet | New Users to Excel | |||
Too Many Rows For Worksheet | New Users to Excel | |||
XLS Worksheet 'Select All' button + top 2 rows not in view in wind | Excel Discussion (Misc queries) | |||
copy rows meeting criteria to another worksheet | Excel Worksheet Functions | |||
In a protected worksheet allow users to delete rows | Excel Worksheet Functions |