Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Does anyone know a way to delete rows with duplicate data in a document with
over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
#2
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Visit Chip at www.cpearson.com.
He has lots of good stuff on the subject of duplicates best wishes -- Bernard V Liengme Microsoft Excel MVP http://people.stfx.ca/bliengme remove caps from email "David" wrote in message ... Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
#3
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
try using Advance Filter
1. Go to Data on the menu bar 2. Select Filter 3. Chosse Advance Filter 4. check Copy to another location ( to keep the original data ) 5. Select the range 6. Select a cell on the same worksheet which is out of the existing data range 7. check the Unique records only 8. click OK HTH -- Your feedback is very much appreciate, pls click on the Yes button below if this posting is helpful. Thank You cheers, francis "David" wrote: Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
#4
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Try some code like the following:
Sub AAA() Dim LastRow As Long Dim TestColumn As String Dim RowNdx As Long Dim TopRow As Long Dim WS As Worksheet Dim DeleteThese As Range Set WS = ActiveSheet TestColumn = "A" '<<<< column to test for duplicates TopRow = 1 '<<<< top-most row of data to test. With WS LastRow = .Cells(.Rows.Count, TestColumn).End(xlUp).Row For RowNdx = LastRow To TopRow Step -1 If Application.CountIf(.Range(.Cells(TopRow, TestColumn), _ .Cells(RowNdx, TestColumn)), _ .Cells(RowNdx, TestColumn)) 1 Then If DeleteThese Is Nothing Then Set DeleteThese = .Rows(RowNdx) Else Set DeleteThese = _ Application.Union(DeleteThese, .Rows(RowNdx)) End If End If Next RowNdx End With If Not DeleteThese Is Nothing Then DeleteThese.Delete End If End Sub Change TestColumn to the letter of the column that is to be used to test for duplicates. Change TopRow to a value 1 if your worksheet has some header rows that should not be tested. Cordially, Chip Pearson Microsoft Most Valuable Professional Excel Product Group, 1998 - 2009 Pearson Software Consulting, LLC www.cpearson.com (email on web site) On Mon, 19 Jan 2009 05:53:02 -0800, David wrote: Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. |
#5
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
For Sort BD:
Sub DeleteDuplicateClassic() Application.ScreenUpdating = False Application.Calculation = xlCalculationManual [A1].Sort Key1:=Range("A2"), Order1:=xlAscending, Header:=xlGuess For i = [A65000].End(xlUp).Row To 2 Step -1 If Cells(i, 1) = Cells(i - 1, 1) Then Rows(i).Delete Next i Application.Calculation = xlCalculationAutomatic End Sub Respect Order:2 s for 10.0000 rows and suppress 5% Sub RespectOrderDictionary() Set MonDico = CreateObject("Scripting.Dictionary") Application.ScreenUpdating = False i = 2 Do While Cells(i, "A") < "" If Not MonDico.Exists(Cells(i, "A") & Cells(i, "C")) Then MonDico.Add Cells(i, "A") & Cells(i, "C"), Cells(i, "A") & Cells(i, "C") i = i + 1 Else Rows(i).EntireRow.Delete End If Loop End Sub 1,17 sec for 10.000 rows and 80% suppress: Sub DeleteDuplicateQuick() t = Timer() Application.ScreenUpdating = False [A1].Sort Key1:=Range("A2"), Order1:=xlAscending, _ Header:=xlGuess Columns("b:b").Insert Shift:=xlToRight [B1] = "ColB" [B2].FormulaR1C1 = "=IF(RC[-1]=R[-1]C[-1],1,0)" [B2].AutoFill Destination:=Range("B2:B" & [A65000].End(xlUp).Row) [B:B].Value = [B:B].Value [A2].CurrentRegion.Sort Key1:=Range("B2"), Order1:=xlAscending, Header:=xlGuess [B:B].Replace What:="1", Replacement:="", LookAt:=xlPart Range("B2:B65000").SpecialCells(xlCellTypeBlanks). EntireRow.Delete Columns("b:b").Delete Shift:=xlToLeft MsgBox Timer() - t End Sub http://cjoint.com/?bvpZzDBano JB http://boisgontierjacques.free.fr On 19 jan, 14:53, David wrote: Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? *Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
#6
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Thanks for the responses everyone. Because I don't know anything about VBA
I'm going to try this option first. One follow up question. Is there a way to tell how many rows were filtered or how many remain? -- Thanks, David "xlmate" wrote: try using Advance Filter 1. Go to Data on the menu bar 2. Select Filter 3. Chosse Advance Filter 4. check Copy to another location ( to keep the original data ) 5. Select the range 6. Select a cell on the same worksheet which is out of the existing data range 7. check the Unique records only 8. click OK HTH -- Your feedback is very much appreciate, pls click on the Yes button below if this posting is helpful. Thank You cheers, francis "David" wrote: Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
#7
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]()
Hi David,
You may also try this formula I am assuming that your range data is in Col A and duplicates can be found in the same Col. Change the range to yours =IF(COUNTIF($A$2:A2,A2)1,"TRUE","") copy this formula and select a cell, let C2, Ctrl-V into the formula bar. Copy as far down as your data is. Filter for Blanks in Col C, Excel will show all entries with a blank in Col C Copy this set of data to a new sheet. or you can filter for "TRUE" in col C and delete all entries with the word "TRUE" either way will give you a set of unique data range. pls note to save a back up before doing any of the suggestions. To count the number of entries, use =count(A:A) HTH -- Your feedback is very much appreciate, pls click on the Yes button below if this posting is helpful. Thank You cheers, francis "David" wrote: Thanks for the responses everyone. Because I don't know anything about VBA I'm going to try this option first. One follow up question. Is there a way to tell how many rows were filtered or how many remain? -- Thanks, David "xlmate" wrote: try using Advance Filter 1. Go to Data on the menu bar 2. Select Filter 3. Chosse Advance Filter 4. check Copy to another location ( to keep the original data ) 5. Select the range 6. Select a cell on the same worksheet which is out of the existing data range 7. check the Unique records only 8. click OK HTH -- Your feedback is very much appreciate, pls click on the Yes button below if this posting is helpful. Thank You cheers, francis "David" wrote: Does anyone know a way to delete rows with duplicate data in a document with over 60,000 rows? Some data is duplicated as much as 10 times and some as few as once or not at all. -- Thanks, David |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Find and Delete Duplicate entries | Excel Discussion (Misc queries) | |||
How do I delete duplicate entries in excel? | New Users to Excel | |||
find duplicate entries and delete them? | Excel Worksheet Functions | |||
How do I delete duplicate entries? | Excel Worksheet Functions | |||
Add numbers for duplicate entries then delete | Excel Worksheet Functions |