LinkBack Thread Tools Search this Thread Display Modes
Prev Previous Post   Next Post Next
  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 12
Default How can I delete 'fuzzy' duplicated rows?

Hi Toppers
have tried it but Im getting Compile error 'End if without block if'
Any suggestions please! Im lost.
Thanks
Ani

"Toppers" wrote:

Hi,

Sort data (PostCode,Surname,Title) and then run this (TEST (data) first!).
It checks ALL three fields and deletes any duplicate rows.


HTH

Sub DeleteDups()

Dim ws1 As Worksheet
Dim lastrow As Long
Dim r As Long

Set ws1 = Worksheets("Sheet1")
With ws1
lastrow = .Cells(Rows.Count, 1).End(xlUp).Row
For r = lastrow To 2 Step -1
If Application.And(.Cells(r, "A") = .Cells(r - 1, "A"), .Cells(r, "C") =
.Cells(r - 1, "C"), .Cells(r, "J") = .Cells(r - 1, "J")) Then
.Rows(r).Delete shift:=xlUp
End If
Next r
End With

End Sub


"ani_unicorn" wrote:

Hi John
thanks for the reply. Im not really sure what you mean. If it helps, my
data is 45,000 rows of addresses. The columns run from a-j. I have split
the file into different books for now to make it more manageable.
The column with the post codes in are exact so I can format to recognise
these. Where I have the problem is in the column containing the road number
and name of road. The addresses have been typed in from sheets that often
give more than one house number but the same road name for an individual
surname. The data is then exactly duplicated on columns A C and J but may
vary in details in the other columns. As such:

Title Initial Surname Hse Name Add1 Add2 Add3 Add4 Add4 Add5 Post Code

I know what I want to do but dont know how. I want to auto delete the whole
of the row with the duplicated post code and corresponding surname in. At
the moment I am deleting manually and I am tearing my hair out! I have used
ad filter to delete the unique records but am now stuck!
Hope your Christmas shopping wasnt too much of a nightmare!
Many thanks, and happy holidays
Ani

"John Coleman" wrote:


ani_unicorn wrote:
I have approximately 45,000 entries that are a mix of text and numerics. I
need to set a formula, macro or something that will help me delete 'fuzzy'or
non-identical rows that are duplicated. ie some rows have typos, or other
similar mistakes.
I can use the unique filter and other formulas to find identical cells but
need to know how to find and delete the rows with odd characters in them.
Please help!
Thanks
Ani

It would be relatively easy to write a function that takes as input two
row indices and returns the percentage of columns in which the two rows
agree. You would need to look at your data and decide how much
agreement constitutes a virtual duplicate. Once you have this function,
then a nested loop can be written which deletes rows if they match
agree with a previous row beyond a certain threshhold. I got to do some
last minute christmas shopping now so don't have time to write any
actual code.

Hope that helps

-John Coleman




 
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Delete rows with duplicated values Emece Excel Discussion (Misc queries) 5 February 22nd 10 12:36 PM
Delete rows with duplicated values Emece Excel Discussion (Misc queries) 2 April 16th 09 12:52 PM
How can I delete duplicated data tweacle[_5_] Excel Worksheet Functions 1 December 20th 07 12:59 AM
How to delete duplicated values in each row ??? Oleg Excel Discussion (Misc queries) 2 March 8th 06 12:57 PM
Delete non duplicated rows cape Excel Discussion (Misc queries) 1 July 8th 05 02:50 AM


All times are GMT +1. The time now is 04:43 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"