View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.misc
WMark WMark is offline
external usenet poster
 
Posts: 3
Default How do I identify matching rows of data and delete them both?

Smatgal,

Thanks for this suggestion. I think this will give me what I'm looking for,
but I'm such a novice at Excel that I have more questions. Allow me to
explain my data in more detail and let's see if I can apply what you
recommend:

- these two spreadsheets represent individual sales transactions (about
7,000 total, so you can see why I don't want to do this by hand). The tables
are produced by two different accounting systems in our company, and they
don't agree (which ultimately affects the paycheck of several people,
including me). One of the tables (let's call it Table A) shows significantly
more transactions than the other (Table B), so I'm trying to identify the
transactions that are unique to Table A. The transactions in Table B
represent what's already been paid on (in commissions), so the important data
is the set of unique transaction in Table B. This means that I have to
compare the two tables and strip the rows out of Table A that already have a
match in Table B.

The columns are not completely the same, but there are some common elements
and I can strip out the rest. The common elements are transaction date,
serial number, and account number.

So, given your instructions, here are some additional questions:

1. In your para 1, I already have the data sorted by transaction date, so
all I need to do is add a column for count, each with a value of 1. I'm
assuming this is working on a merged table that contains both Table A and
Table B, correct?

2. I'm new to pivot tables, but it sounds straightforward. What I end up
with is a count of how many times each transaction appears. I guess I can
then sort by the count, and delete everything greater than 1.

3. I've never used vlookup, so I'll have to read the Excel help file to
figure out how that's used.

I think I can get this to work, but let me know if I misunderstood anything.
Thanks again for your help!

W Mark




"smartgal" wrote:

You might also try a pivot table. Add a column to identify the source of
each set of data. Say you're doing a table of employees and comparing Sep to
Oct and you want to delete everybody who appears in both months, you would
add a column to identify the month and the Septembers will all say "Sep" and
the Octobers will all say "Oct." Then put the Oct data under the Sep data
(no blank rows between them). Also add a colum at the end and call it
"count" and give each entry a value of "1" (this will help in the pivot).

Then do the pivot table - from the pull-down menu choose Data / Pivot Table
(note that if your cursor is anywhere on the data and not outside it, it
anticipates you want the data from top to bottom). Choose Next. Then click
the "layout" tab. For our payroll comparison example, click on the employee
name field and drag it to "row" then click on the "month" field and drag it
to "row" and place it under the month. Then click on the "count" field and
drag it to "data." Click finish.

When done you'll have a pivot table that lists each employee, the month in
which they appear and the total number of times the employee appears. What
you'd be looking for here is those that only appear once (because assuming
nobody's listed in the same month two times, all the others are
month-to-month duplicates).

This might sound like a lot but it's pretty quick, actually. You could then
extract all the singles and do a vlookup of those against your master data to
delete those you don't need. Once you extract the names you want to keep,
add a column and title it "yes." Then do a vlookup on the original data
against this new list of "yes" candidates and everybody that's a no can go.
**Note that before you do this, I'd make a copy of the original data before
you delete anything so that just in case, you still have a good, unperverted
copy.

Good luck!

"WMark" wrote:

I have two worksheets...and have merged them, for now, in an attempt to
identify the rows that are common to both. Those that are common to both
worksheets I want to delete from the merged file. Does anyone know how this
can be accomplished without sorting then manually deleting the matches?

Thanks in advance,