Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 12
Default Extract duplicate rows to another worksheet

I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 2,118
Default Extract duplicate rows to another worksheet

Here's an approach pulls the unique records onto another sheet:

Assumptions:
Sheet1 contains your data in cells A1:C1000
Sheet2 is where you want the extracted unique data to be displayed

Using Sheet2:
A1: LastName
B1: FirstName
C1: Address

InsertNameDefine
Names in workbook: Sheet2!Extract
Refers to: =Sheet2!$A$1:$C$1

Still using Sheet2:
InsertNameDefine
Names in workbook: Sheet2!Database
Refers to: =Sheet1!$A$1:$C$1000

(Notice: you are on Sheet2, and creating a Sheet2-level range name, but
the referenced range is on Sheet1)

The reason: An advanced filter cannot SEND data to another sheet, but
it can PULL data FROM another sheet.

Now...set up the Advanced Data Filter:
<Data<Filter<Advanced Filter
Select: Copy to another location
List Range: (press F3 and select Database)
Criteria Range: (none for this exercise)
Copy To: (press F3 and select Extract)
Check: Unique records only
Click [OK]

Note: if you want to run that Advanced Data Filter again,
you'll need to re-select the range "Database" each time

Is that something you can work with?
***********
Regards,
Ron

XL2002, WinXP


"Northwoods" wrote:

I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Max Max is offline
external usenet poster
 
Posts: 9,221
Default Extract duplicate rows to another worksheet

Just another option to play with ..

Assume source data is in a sheet named: X
within cols A to G, data from row2 down to row 22000
where cols A to C are the key cols: LastName, FirstName & Address

In a new sheet named: U (for "Uniques", say)
Paste the 6 col headers from X's A1:F1 into B1:G1

Put in A2:
=IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),""))
Copy A2 down to cover the max expected extent of source data, say down to
A22200. (Leave A1 blank)

Then put in B2:
=IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0)))
Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient to
cover the max expected number of unique lines from X, say down to G5000. When
the entire uniques extract is exhausted you'd see only "blank" rows. Cols B
to G will extract only the unique* lines from X, with all results neatly
bunched at the top. Hide away col A if desired.
*ie lines in X with unique: LastName - FirstName - Address

Then dress up U nicely, make a copy of it, rename the copy as say: D (for
"Duplicates").

In sheet: D,

Replace the formula in A2 by:
=IF(COUNTBLANK(X!A2:C2)=3,"",IF(U!A2="",ROW(),""))
then copy A2 down to A22200 as before. (Leave A1 blank)

Select B2:G2 (same formulas - no change), fill down by the smallest extent
sufficient to cover the max expected "balance" duplicate lines from X, say
down to G17200 (assuming 5,000 max unique lines, balance duplicates expected
is 22200 - 5000 = 17200 roughly). When the entire duplicates extract is
exhausted you'd see only "blank" rows, as in the uniques sheet U. Cols B to G
will extract only the duplicate lines from X, with all results neatly bunched
at the top. Hide away col A if desired.

Easiest way to try out the above is to simply rename a copy of your source
sheet as: X, ensure the data structure is as assumed, then plug-in the
formulas "as-is", and power away. You can always restore the source sheet's
name later and leave it to Excel to auto-change the formulas to suit.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
"Northwoods" wrote:
I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 12
Default Extract duplicate rows to another worksheet

Thanks everyone,
I am going to get to work and post the results.
Northwoods

"Max" wrote:

Just another option to play with ..

Assume source data is in a sheet named: X
within cols A to G, data from row2 down to row 22000
where cols A to C are the key cols: LastName, FirstName & Address

In a new sheet named: U (for "Uniques", say)
Paste the 6 col headers from X's A1:F1 into B1:G1

Put in A2:
=IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),""))
Copy A2 down to cover the max expected extent of source data, say down to
A22200. (Leave A1 blank)

Then put in B2:
=IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0)))
Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient to
cover the max expected number of unique lines from X, say down to G5000. When
the entire uniques extract is exhausted you'd see only "blank" rows. Cols B
to G will extract only the unique* lines from X, with all results neatly
bunched at the top. Hide away col A if desired.
*ie lines in X with unique: LastName - FirstName - Address

Then dress up U nicely, make a copy of it, rename the copy as say: D (for
"Duplicates").

In sheet: D,

Replace the formula in A2 by:
=IF(COUNTBLANK(X!A2:C2)=3,"",IF(U!A2="",ROW(),""))
then copy A2 down to A22200 as before. (Leave A1 blank)

Select B2:G2 (same formulas - no change), fill down by the smallest extent
sufficient to cover the max expected "balance" duplicate lines from X, say
down to G17200 (assuming 5,000 max unique lines, balance duplicates expected
is 22200 - 5000 = 17200 roughly). When the entire duplicates extract is
exhausted you'd see only "blank" rows, as in the uniques sheet U. Cols B to G
will extract only the duplicate lines from X, with all results neatly bunched
at the top. Hide away col A if desired.

Easiest way to try out the above is to simply rename a copy of your source
sheet as: X, ensure the data structure is as assumed, then plug-in the
formulas "as-is", and power away. You can always restore the source sheet's
name later and leave it to Excel to auto-change the formulas to suit.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
"Northwoods" wrote:
I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Max Max is offline
external usenet poster
 
Posts: 9,221
Default Extract duplicate rows to another worksheet

Thanks for posting back !
Let us know how it went for you
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
"Northwoods" wrote in message
...
Thanks everyone,
I am going to get to work and post the results.
Northwoods





  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 12
Default Extract duplicate rows to another worksheet

Hi Ron, your solution worked great but I miss the necessary 3 extra columns.
They can't be used to determine unique records because it is dissimular data.
However, when all is said and done, I still need those 3 columns included in
the Unique list.
Max, I need to know a quick way to copy the formula down to the 22,000 row.
I was dragging down and then I thought hey there must be a better way. Any
tips?
Thanks
fands

"Ron Coderre" wrote:

Here's an approach pulls the unique records onto another sheet:

Assumptions:
Sheet1 contains your data in cells A1:C1000
Sheet2 is where you want the extracted unique data to be displayed

Using Sheet2:
A1: LastName
B1: FirstName
C1: Address

InsertNameDefine
Names in workbook: Sheet2!Extract
Refers to: =Sheet2!$A$1:$C$1

Still using Sheet2:
InsertNameDefine
Names in workbook: Sheet2!Database
Refers to: =Sheet1!$A$1:$C$1000

(Notice: you are on Sheet2, and creating a Sheet2-level range name, but
the referenced range is on Sheet1)

The reason: An advanced filter cannot SEND data to another sheet, but
it can PULL data FROM another sheet.

Now...set up the Advanced Data Filter:
<Data<Filter<Advanced Filter
Select: Copy to another location
List Range: (press F3 and select Database)
Criteria Range: (none for this exercise)
Copy To: (press F3 and select Extract)
Check: Unique records only
Click [OK]

Note: if you want to run that Advanced Data Filter again,
you'll need to re-select the range "Database" each time

Is that something you can work with?
***********
Regards,
Ron

XL2002, WinXP


"Northwoods" wrote:

I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 2,118
Default Extract duplicate rows to another worksheet

I need a little more information....

A client has 5 records, all with identical LastName, FirstName, and Address
values.
I'm guessing that the other fields contain other information, different for
each record.

Of those 5 records, what deteremines which record contains the data in the
other fields that you want associated with the "unique" record?

***********
Regards,
Ron

XL2002, WinXP


"Northwoods" wrote:

Hi Ron, your solution worked great but I miss the necessary 3 extra columns.
They can't be used to determine unique records because it is dissimular data.
However, when all is said and done, I still need those 3 columns included in
the Unique list.
Max, I need to know a quick way to copy the formula down to the 22,000 row.
I was dragging down and then I thought hey there must be a better way. Any
tips?
Thanks
fands

"Ron Coderre" wrote:

Here's an approach pulls the unique records onto another sheet:

Assumptions:
Sheet1 contains your data in cells A1:C1000
Sheet2 is where you want the extracted unique data to be displayed

Using Sheet2:
A1: LastName
B1: FirstName
C1: Address

InsertNameDefine
Names in workbook: Sheet2!Extract
Refers to: =Sheet2!$A$1:$C$1

Still using Sheet2:
InsertNameDefine
Names in workbook: Sheet2!Database
Refers to: =Sheet1!$A$1:$C$1000

(Notice: you are on Sheet2, and creating a Sheet2-level range name, but
the referenced range is on Sheet1)

The reason: An advanced filter cannot SEND data to another sheet, but
it can PULL data FROM another sheet.

Now...set up the Advanced Data Filter:
<Data<Filter<Advanced Filter
Select: Copy to another location
List Range: (press F3 and select Database)
Criteria Range: (none for this exercise)
Copy To: (press F3 and select Extract)
Check: Unique records only
Click [OK]

Note: if you want to run that Advanced Data Filter again,
you'll need to re-select the range "Database" each time

Is that something you can work with?
***********
Regards,
Ron

XL2002, WinXP


"Northwoods" wrote:

I have a client history that includes similar (Name, address) and dissimilar
(date) information. I want to extract duplicate records to another worksheet
and leave only 1 record per client on the other. Problem is querying only the
first 3 columns, LastName, FirstName & Address and ignoring the remaining 3
columns that have dissimilar information and the system would consider unique.
There are 22,000 records.
Any ideas most appreciated

  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Max Max is offline
external usenet poster
 
Posts: 9,221
Default Extract duplicate rows to another worksheet

"Northwoods" wrote
.. Max, I need to know a quick way to copy the formula
down to the 22,000 row. I was dragging down and then
I thought hey there must be a better way. Any tips?


99% of the time, I'd just simply copy the formula/s by dragging down.

Anyway, here's an alternative to dragging down. Since we're dealing with
large ranges and calc intensive formulas here, as a precaution, I'd set the
book's calc mode to "Manual" first, via clicking Tools Options
Calculation tab Check "Manual" OK.

Eg for:

Put in A2:
=IF(COUNTBLANK(X!A2:C2)=3,"",IF(SUMPRODUCT((X!$A$2 :A2=X!A2)*(X!$B$2:B2=X!B2)*(X!$C$2:C2=X!C2))<2,ROW (),""))
Copy A2 down to cover the max expected extent of source data,
say down to A22200.


Type the required range in the namebox, viz.: A2:A22200
Press ENTER. This will select the range with A2 active

Then with range selected, copy & paste the formula directly into the formula
bar (this will be for A2, the active cell) and press CTRL+ENTER (instead of
just pressing ENTER). This will fill the formulas for the entire range
A2:A22200

As for the other fill down for a multi-col range ..

Then put in B2:
=IF(ROW(A1)COUNT($A:$A),"",INDEX(X!A:A,MATCH(SMAL L($A:$A,ROW(A1)),$A:$A,0)))
Copy B2 to G2. Select B2:G2, fill down by the smallest extent sufficient
to
cover the max expected number of unique lines from X, say down to G5000.


After copying B2 to G2, just right-click to copy the range B2:G2
Then type the reqd range in the name box, ie: B3:G5000
press ENTER (B3 will be active). Then right-click on the selected range,
choose Paste special check "Formulas" OK.

And after filling all of the above / whenever required, we should remember
to press F9 to calculate.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Too Many Rows For Worksheet Gord Dibben New Users to Excel 3 July 28th 06 01:49 PM
Too Many Rows For Worksheet Peo Sjoblom New Users to Excel 0 July 27th 06 10:46 PM
XLS Worksheet 'Select All' button + top 2 rows not in view in wind JoB Excel Discussion (Misc queries) 1 June 17th 06 01:14 AM
copy rows meeting criteria to another worksheet confused Excel Worksheet Functions 4 October 4th 05 11:51 AM
In a protected worksheet allow users to delete rows Jason Trivett Excel Worksheet Functions 1 July 12th 05 09:50 AM


All times are GMT +1. The time now is 10:07 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"