Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 3
Default Code Speed Up

I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

Can this be sped up?

I'm thinking that flagging all and then removing might speed things
up, but I'm not sure about this scenario.

Another chunk which actually appears to be quicker than the above,
which I modified to remove both duplicates and originals (originally
provided by Patrick Molloy) is:

Sub RemoveDupesAndOriginals()
Remove_Dupes 3
End Sub

Sub Remove_Dupes(testcol As Long)
Dim Col As Long
Dim lastrow As Long
Dim thisrow As Long
Dim lastrow2 As Long
Dim thisrow2 As Long

On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' get the last column, then add the row numbers
Col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row
lastrow2 = lastrow
' add a column fro the original row order
With Range(Cells(1, Col), Cells(lastrow, Col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, Col))
.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1
If Cells(thisrow, testcol).Value = Cells(thisrow - 1,
testcol).Value Then
Cells(thisrow - 1, testcol + 2).Value = 1
Cells(thisrow, testcol + 2).Value = 1
Rows(thisrow).Delete
End If
Next
'Delete the originals which had duplicates
For thisrow2 = lastrow2 To 2 Step -1
If Cells(thisrow2, testcol + 2).Value = 1 Then
Rows(thisrow2).Delete
End If
Next

If Cells(1, testcol + 2).Value = 1 Then
Rows(1).Delete
End If
'restore whats left to the original order
.Sort Cells(1, Col)

End With

EndMacro:

Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic

End Sub
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 27,285
Default Code Speed Up

assume this can be determined by looking at the values in column A

Sub DeleteDups()
Dim rng As Range
Columns(2).Insert
Set rng = Range(Cells(1, 1), _
Cells(Rows.Count, 1).End(xlUp))
rng.Offset(0, 1).Formula = _
"=if(countif($A$1:A1,A1)1,na(),false)"
rng.Offset(0, 1).SpecialCells(xlFormulas, _
xlErrors).EntireRow.Delete
Columns(2).Delete
End Sub

--
Regards,
Tom Ogilvy



--
Regards,
Tom Ogilvy

"lists" wrote in message
om...
I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

Can this be sped up?

I'm thinking that flagging all and then removing might speed things
up, but I'm not sure about this scenario.

Another chunk which actually appears to be quicker than the above,
which I modified to remove both duplicates and originals (originally
provided by Patrick Molloy) is:

Sub RemoveDupesAndOriginals()
Remove_Dupes 3
End Sub

Sub Remove_Dupes(testcol As Long)
Dim Col As Long
Dim lastrow As Long
Dim thisrow As Long
Dim lastrow2 As Long
Dim thisrow2 As Long

On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' get the last column, then add the row numbers
Col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row
lastrow2 = lastrow
' add a column fro the original row order
With Range(Cells(1, Col), Cells(lastrow, Col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, Col))
.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1
If Cells(thisrow, testcol).Value = Cells(thisrow - 1,
testcol).Value Then
Cells(thisrow - 1, testcol + 2).Value = 1
Cells(thisrow, testcol + 2).Value = 1
Rows(thisrow).Delete
End If
Next
'Delete the originals which had duplicates
For thisrow2 = lastrow2 To 2 Step -1
If Cells(thisrow2, testcol + 2).Value = 1 Then
Rows(thisrow2).Delete
End If
Next

If Cells(1, testcol + 2).Value = 1 Then
Rows(1).Delete
End If
'restore whats left to the original order
.Sort Cells(1, Col)

End With

EndMacro:

Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic

End Sub



  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 783
Default Code Speed Up

lists wrote:

I received some excellent responses regarding what I'm trying to
accomplish in the way of removing duplicates from a spreadsheet. I
have, however, found that this takes an extremely long time to finish.
I'm half way tempted to write a C program to do this as I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm

Can this be sped up?


Generally you can get a dramatic improvement by transferring the data
from the range(s) to array(s) and looping in the arrays, then returning
the data from the array(s) to the worksheet. This would require
mounting the problem of deleting rows of an array, or flagging the rows
targeted for deletion and deleting them after returning the array data
to the worksheet. This may seem a bit daunting, but the factor of
improvement in speed of execution by looping in the array(s) is often
several hundredfold.

Alan Beban
  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 11
Default Code Speed Up


-----Original Message-----
I received some excellent responses regarding what I'm

trying to
accomplish in the way of removing duplicates from a

spreadsheet. I
have, however, found that this takes an extremely long

time to finish.
I'm half way tempted to write a C program to do this as

I have more
than 10,000 rows to work with per table.

One chunk for removing duplicates is found:
http://www.cpearson.com/excel/duplicat.htm



You could try this (uses a native Excel function for the
speed).

Sub unique_values()
'Creates a sorted list of unique values starting at Target
'Rev A 27/5/2003

'PRELIMINARIES
Dim Examine As String, Target As String, ThisPrompt As
String, title As String
Dim UserRng_A As Range, UserRng_B As Range
Dim valu As Variant

'STEP 1 DETERMINE WHERE THE RAW DATA IS
ThisPrompt = "Where is the top of the VALUES to test ? eg
A3 or B5"
title = "UNIQUE VALUES (Rev A)"
On Error Resume Next ' in case a range does not get
selected
'The use of the "Set" statement assigns the output to the
selected ActiveCell
Set UserRng_A = Application.InputBox(prompt:=ThisPrompt,
title:=title, _
Default:=ActiveCell.Address, Type:=8) '"Type 8" means a
Range result.
If UserRng_A Is Nothing Then 'input was box cancelled
MsgBox "Cancelled"
Exit Sub ' Rev A
End If

'STEP 2 DETERMINE WHERE TO PUT THE LIST
ThisPrompt = "Where is the Data to be put ?" _
& Chr(13) & Chr(13) & "You will need blank cells under
the it."
Set UserRng_B = Application.InputBox(prompt:=ThisPrompt,
title:="Select a cell", _
Default:=ActiveCell.Address, Type:=8)
If UserRng_B Is Nothing Then
MsgBox "Cancelled"
Exit Sub ' Rev A
End If
Target = UserRng_B.Address() 'the address of the selected
cell

'STEP 3 GATHER BASIC DATA
Application.ScreenUpdating = False
UserRng_A(0, 1).Select 'select the cell above
Examine = Selection.Address() 'the address of the cell
above
valu = Selection.Formula 'store the contents of the cell
one row above the first data
UserRng_A(0, 1).Formula = "temporary string" 'THE
ADVANCED FILTER DEMANDS A STRING IN THIS CELL


'STEP 4 CREATE THE UNIQUE ENTRIES
Range(Target).Clear 'needed to stop filtering falling over
Range(Examine).Activate 'filter then insert unique values
starting at Target
Range(Examine, ActiveCell.End(xlDown)).AdvancedFilter
Action:=xlFilterCopy, _
CopyToRange:=Range(Target), Unique:=True
'now sort the values
Range(Target).Select 'musn't remove this line
Range(Target, ActiveCell.End(xlDown)).Select
Selection.Sort Key1:=Range(Target), Order1:=xlAscending,
Header:=xlYes, _
OrderCustom:=1

'STEP 5 TIDY UP
UserRng_B.Formula = ""
Range(Examine).Formula = valu 'restore the original entry
to this cell
Application.ScreenUpdating = True

End Sub

Can this be sped up?

I'm thinking that flagging all and then removing might

speed things
up, but I'm not sure about this scenario.

Another chunk which actually appears to be quicker than

the above,
which I modified to remove both duplicates and originals

(originally
provided by Patrick Molloy) is:

Sub RemoveDupesAndOriginals()
Remove_Dupes 3
End Sub

Sub Remove_Dupes(testcol As Long)
Dim Col As Long
Dim lastrow As Long
Dim thisrow As Long
Dim lastrow2 As Long
Dim thisrow2 As Long

On Error GoTo EndMacro
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual

' get the last column, then add the row numbers
Col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row
lastrow2 = lastrow
' add a column fro the original row order
With Range(Cells(1, Col), Cells(lastrow, Col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, Col))
.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1
If Cells(thisrow, testcol).Value = Cells

(thisrow - 1,
testcol).Value Then
Cells(thisrow - 1, testcol +

2).Value = 1
Cells(thisrow, testcol + 2).Value = 1
Rows(thisrow).Delete
End If
Next
'Delete the originals which had duplicates
For thisrow2 = lastrow2 To 2 Step -1
If Cells(thisrow2, testcol + 2).Value =

1 Then
Rows(thisrow2).Delete
End If
Next

If Cells(1, testcol + 2).Value = 1 Then
Rows(1).Delete
End If
'restore whats left to the original order
.Sort Cells(1, Col)

End With

EndMacro:

Application.ScreenUpdating = True
Application.Calculation = xlCalculationAutomatic

End Sub
.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Need to Speed Up A Code LostInNY Excel Worksheet Functions 2 July 20th 09 06:18 PM
Can you speed UP drag speed? Ryan W Excel Discussion (Misc queries) 1 October 24th 05 06:09 PM
ListView to Excel Code (but needs SPEED improvements) SVD Excel Programming 1 February 2nd 04 10:54 AM
Recalculation Speed After Editing Macro Code Bob Keating Excel Programming 1 November 16th 03 01:08 PM
Analyzing code speed mbobro[_2_] Excel Programming 1 November 3rd 03 10:05 PM


All times are GMT +1. The time now is 04:00 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"