Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 923
Default Best Method to Remove Data Rows

I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to
delete all rows where a specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to first,
deleting the entire row when the value in the column does not meet the
criteria. This works great except for large files - where it is very slow.

I had thought about sorting the file and then deleting those groups of
records where the criteria is not being met but this requires extra code to
identifier each part of the file to be removed.

Another idea was to use autofilters, where the negated criteria selection
allows me to delete all visible rows.

Any ideas as to the best and fastest method, without having to trawl through
all the data, row by row?


--
Cheers
Nigel




  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 3
Default Best Method to Remove Data Rows

You might want to try turning off automatic calculations.


Nigel wrote:
I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to
delete all rows where a specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to first,
deleting the entire row when the value in the column does not meet the
criteria. This works great except for large files - where it is very slow.

I had thought about sorting the file and then deleting those groups of
records where the criteria is not being met but this requires extra code to
identifier each part of the file to be removed.

Another idea was to use autofilters, where the negated criteria selection
allows me to delete all visible rows.

Any ideas as to the best and fastest method, without having to trawl through
all the data, row by row?


--
Cheers
Nigel


  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 923
Default Best Method to Remove Data Rows

It's a good tip, but I already have that in my existing code. In my
spreadsheet most of the data is passive (very few formulas), but there are
some overall counters using subtotal that when switched off speed up the
deletions, but only slightly.

--
Cheers
Nigel



"Antchi" wrote in message
oups.com...
You might want to try turning off automatic calculations.


Nigel wrote:
I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to
delete all rows where a specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to
first,
deleting the entire row when the value in the column does not meet the
criteria. This works great except for large files - where it is very
slow.

I had thought about sorting the file and then deleting those groups of
records where the criteria is not being met but this requires extra code
to
identifier each part of the file to be removed.

Another idea was to use autofilters, where the negated criteria selection
allows me to delete all visible rows.

Any ideas as to the best and fastest method, without having to trawl
through
all the data, row by row?


--
Cheers
Nigel




  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 24
Default Best Method to Remove Data Rows


Filter your data for the rows you want to delete and then run the
following code

Sub Remove_Data()

Dim rng As Range
Dim rng2 As Range

With ActiveSheet.AutoFilter.Range
On Error Resume Next
Set rng2 = .Offset(1, 0).Resize(.Rows.Count - 1, 1) _
.SpecialCells(xlCellTypeVisible)
On Error GoTo 0
End With

If rng2 Is Nothing Then
' Do Nothing
Else ' Delete the rows
Set rng = ActiveSheet.AutoFilter.Range
rng.Offset(1, 0).Resize(rng.Rows.Count - 1).Delete
End If

ActiveSheet.AutoFilterMode = False

End Sub

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,441
Default Best Method to Remove Data Rows

Nigel,

Both stepping through the rows and using the autofilter method will be slow with large workbooks, as
Excel needs to move all that data around every time a row or group of rows is deleted. You could
try this, which I have found to be faster: it sorts twice, each time getting rid of the rows above
the desired value, so Excel is only deleting two blocks of rows. Note that you didn't say what your
criteria for saving was: hence the use of a variant. This assumes your data starts in row 2 with a
header in row 1. Give it a try - and I would be interested in hearing if it made much of a
difference in your case.

HTH,
Bernie
MS Excel MVP

Sub Macro1()
Dim myCell As Range
Dim myCol As Integer
Dim WhatToSave As Variant
Dim myCalc As Variant

WhatToSave = "Criteria for saving"
myCol = 3

With Application
.ScreenUpdating = False
myCalc = .Calculation
.Calculation = xlCalculationManual
.EnableEvents = False
End With

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlAscending, Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol), LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0, 1)).EntireRow.Delete

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlDescending, Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol), LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0, 1)).EntireRow.Delete

With Application
.ScreenUpdating = True
.Calculation = myCalc
.EnableEvents = True
End With

End Sub



"Nigel" wrote in message
...
I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to delete all rows where a
specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to first, deleting the entire row
when the value in the column does not meet the criteria. This works great except for large
files - where it is very slow.

I had thought about sorting the file and then deleting those groups of records where the criteria
is not being met but this requires extra code to identifier each part of the file to be removed.

Another idea was to use autofilters, where the negated criteria selection allows me to delete all
visible rows.

Any ideas as to the best and fastest method, without having to trawl through all the data, row by
row?


--
Cheers
Nigel








  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 923
Default Best Method to Remove Data Rows

Thanks for the code, this is similar to the method I developed that uses a
filter selection and deletes the rest. But although faster than the scan
all rows method it is still quite slow with my worksheet..

But it is an improvement!

--
Cheers
Nigel



wrote in message
oups.com...

Filter your data for the rows you want to delete and then run the
following code

Sub Remove_Data()

Dim rng As Range
Dim rng2 As Range

With ActiveSheet.AutoFilter.Range
On Error Resume Next
Set rng2 = .Offset(1, 0).Resize(.Rows.Count - 1, 1) _
.SpecialCells(xlCellTypeVisible)
On Error GoTo 0
End With

If rng2 Is Nothing Then
' Do Nothing
Else ' Delete the rows
Set rng = ActiveSheet.AutoFilter.Range
rng.Offset(1, 0).Resize(rng.Rows.Count - 1).Delete
End If

ActiveSheet.AutoFilterMode = False

End Sub



  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 923
Default Best Method to Remove Data Rows

Thanks for the tip, I had come to the same conclusion that using sort was
the best and fastest method. What I had not considered was using two sorts
to create the two blocks of data, neat trick. This works the fastest of the
three methods I have explored.

Thanks for the advice

--
Cheers
Nigel



"Bernie Deitrick" <deitbe @ consumer dot org wrote in message
...
Nigel,

Both stepping through the rows and using the autofilter method will be
slow with large workbooks, as Excel needs to move all that data around
every time a row or group of rows is deleted. You could try this, which I
have found to be faster: it sorts twice, each time getting rid of the rows
above the desired value, so Excel is only deleting two blocks of rows.
Note that you didn't say what your criteria for saving was: hence the use
of a variant. This assumes your data starts in row 2 with a header in row
1. Give it a try - and I would be interested in hearing if it made much
of a difference in your case.

HTH,
Bernie
MS Excel MVP

Sub Macro1()
Dim myCell As Range
Dim myCol As Integer
Dim WhatToSave As Variant
Dim myCalc As Variant

WhatToSave = "Criteria for saving"
myCol = 3

With Application
.ScreenUpdating = False
myCalc = .Calculation
.Calculation = xlCalculationManual
.EnableEvents = False
End With

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlAscending,
Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol),
LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0,
1)).EntireRow.Delete

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlDescending,
Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol),
LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0,
1)).EntireRow.Delete

With Application
.ScreenUpdating = True
.Calculation = myCalc
.EnableEvents = True
End With

End Sub



"Nigel" wrote in message
...
I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to
delete all rows where a specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to
first, deleting the entire row when the value in the column does not meet
the criteria. This works great except for large files - where it is very
slow.

I had thought about sorting the file and then deleting those groups of
records where the criteria is not being met but this requires extra code
to identifier each part of the file to be removed.

Another idea was to use autofilters, where the negated criteria selection
allows me to delete all visible rows.

Any ideas as to the best and fastest method, without having to trawl
through all the data, row by row?


--
Cheers
Nigel








  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,441
Default Best Method to Remove Data Rows

Nigel,

You're welcome.

For any other interested parties, I compared the three methods, using a high precision timing
routine and a workbook with 22,222 rows of intermixed values, with 3,333 values that should be kept
(meaning about 19,000 rows to be deleted)

Double Sorting / deletion took between 0.1 and 0.11 seconds.

Filter / deletion took 10.42 seconds, about 100 times longer.

Stepping up through the rows took 92.85 seconds, about 1000 times longer.

Note that for each macro, I turned off screen updating, events, and set calculation to manual.

It is an interesting side note (at least to me) that it was this exact problem that got me
interested in programming Excel, almost 10 years ago. In Quattro Pro, stepping up and deleting the
rows took almost no time, but with Excel 5, it was go out and get a cup of coffee, eat breakfast,
run to the bank, etc..... obviously requiring a different technique.

HTH,
Bernie
MS Excel MVP


"Nigel" wrote in message ...
Thanks for the tip, I had come to the same conclusion that using sort was the best and fastest
method. What I had not considered was using two sorts to create the two blocks of data, neat
trick. This works the fastest of the three methods I have explored.

Thanks for the advice

--
Cheers
Nigel



"Bernie Deitrick" <deitbe @ consumer dot org wrote in message
...
Nigel,

Both stepping through the rows and using the autofilter method will be slow with large workbooks,
as Excel needs to move all that data around every time a row or group of rows is deleted. You
could try this, which I have found to be faster: it sorts twice, each time getting rid of the
rows above the desired value, so Excel is only deleting two blocks of rows. Note that you didn't
say what your criteria for saving was: hence the use of a variant. This assumes your data starts
in row 2 with a header in row 1. Give it a try - and I would be interested in hearing if it made
much of a difference in your case.

HTH,
Bernie
MS Excel MVP

Sub Macro1()
Dim myCell As Range
Dim myCol As Integer
Dim WhatToSave As Variant
Dim myCalc As Variant

WhatToSave = "Criteria for saving"
myCol = 3

With Application
.ScreenUpdating = False
myCalc = .Calculation
.Calculation = xlCalculationManual
.EnableEvents = False
End With

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlAscending, Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol), LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0, 1)).EntireRow.Delete

Columns(myCol).Sort Key1:=Cells(2, myCol), Order1:=xlDescending, Header:=xlYes
Set myCell = Columns(myCol).Find(What:=WhatToSave, After:=Cells(1, myCol), LookAt:=xlWhole)
If myCell.Row < 2 Then Range(Cells(2, myCol), myCell(0, 1)).EntireRow.Delete

With Application
.ScreenUpdating = True
.Calculation = myCalc
.EnableEvents = True
End With

End Sub



"Nigel" wrote in message
...
I have a large spreadsheet 60 columns by up to 20,000 rows. I wish to delete all rows where a
specific value does not appear in one column.

In the past I have used a loop that reads all the rows from last to first, deleting the entire
row when the value in the column does not meet the criteria. This works great except for large
files - where it is very slow.

I had thought about sorting the file and then deleting those groups of records where the
criteria is not being met but this requires extra code to identifier each part of the file to be
removed.

Another idea was to use autofilters, where the negated criteria selection allows me to delete
all visible rows.

Any ideas as to the best and fastest method, without having to trawl through all the data, row
by row?


--
Cheers
Nigel










Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
remove duplicate rows from specific data in a column; excel 2007 DS Excel Worksheet Functions 1 August 6th 08 08:16 PM
Remove Blank and Non Data Rows UT Excel Discussion (Misc queries) 3 June 12th 07 10:01 PM
How to conditionally remove rows with data wmc Excel Worksheet Functions 0 February 27th 07 10:44 PM
Remove data from certain rows and place in new row or spreadsheet John Excel Worksheet Functions 1 October 19th 05 09:08 PM
Is there any method to remove... ªü¤T Excel Programming 12 April 27th 04 08:40 PM


All times are GMT +1. The time now is 01:57 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"