Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 17
Default Filtering Column Duplicates

Good day,

I'm attempting to extract duplicate comma-delimited text strings (cell
character length greater that 255 - limitation of countif) found in a
column and return those results in a separate worksheet and in
separate rows for each duplicate found in the original column.

For example: (keep in mind, this example is less than 255 characters
per cell, but I need this to work for cell over 255 characters)

Sheet1

A1 : Horse,cow,pig,
A2 : Pig,cat,horse
A3 : Cow,dog,cat

Would return:

Sheet2

A1 : Horse
A2 : Cow
A3 : Pig
A4 : Cat
A5 : Dog

Any VBA miracle minds attempting this one, will have my deepest
gratitude.


thanks,
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 27,285
Default Filtering Column Duplicates

Modification of this code from John Walkenback's site. Assumes xl2000 or
later

http://j-walk.com/ss/excel/tips/tip47.htm

Sub RemoveDuplicates()
Dim AllCells As Range, Cell As Range
Dim NoDupes As New Collection
Dim i As Integer, v As Variant

' The items are in A1:A105
Set AllCells = Range("A1:A105")

' The next statement ignores the error caused
' by attempting to add a duplicate key to the collection.
' The duplicate is not added - which is just what we want!
On Error Resume Next
For Each Cell In AllCells
v = Split(cell,",")
for i = lbound(v) to ubound(v)
v(i) = Replace(v(i),".","")
v(i) = Replace(v(i),"?","")
v(i) = Replace(v(i),"!","")
NoDupes.Add v(i), CStr(v(i))
' Note: the 2nd argument (key) for the Add method must be a string
Next i
Next Cell

' Resume normal error handling
On Error GoTo 0
redim v(1 to NoDupes.count, 1 to 1)
' Sort the collection (optional)
For i = 1 To NoDupes.Count - 1
v(i,1) = NoDupes(i)
Next i

With Range("J1").Resize(Nodupes.count,1)
.Value = v
.sort Key1:=Range("J1"), Order1:=xlAscending, _
header:=xlNo
end With
End Sub

Code has not been tested and may contain typos.

--
Regards,
Tom Ogilvy


"Craig Freeman" wrote in message
om...
Good day,

I'm attempting to extract duplicate comma-delimited text strings (cell
character length greater that 255 - limitation of countif) found in a
column and return those results in a separate worksheet and in
separate rows for each duplicate found in the original column.

For example: (keep in mind, this example is less than 255 characters
per cell, but I need this to work for cell over 255 characters)

Sheet1

A1 : Horse,cow,pig,
A2 : Pig,cat,horse
A3 : Cow,dog,cat

Would return:

Sheet2

A1 : Horse
A2 : Cow
A3 : Pig
A4 : Cat
A5 : Dog

Any VBA miracle minds attempting this one, will have my deepest
gratitude.


thanks,



  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 2,253
Default Filtering Column Duplicates


My solution is very similar to Tom's post.

however I've used a Dictionary object (from Scripting Runtime) rather
then a Collection. I find it more versatile and in many cases it
outperforms the collection, certainly when used WITH a reference to the
library and Dim'ed as dictionary



Here goes:
Option Explicit

Const DELIM = ","

Sub Parser()
Dim arr As Variant
Dim itm As Variant
Dim rSrc As Range
Dim rCel As Range
Dim oDic As Object

'We're using an unbound Scripting Dictionary object
'Preferably add reference to Microsoft Scripting Runtime and use
'Dim oDic As Dictionary
'Set oDic = New Dictionary

Set oDic = CreateObject("Scripting.Dictionary")

'Dictionary s/b case INsensitive
oDic.CompareMode = TextCompare

'Set source (Limited to usedrange)
Set rSrc = Worksheets(1).Columns(1)
Set rSrc = Intersect(rSrc, rSrc.Parent.UsedRange)

'Ignore errors (thrown on duplicate entries)
On Error Resume Next

For Each rCel In rSrc.Cells
With rCel
If Not IsNumeric(.Value) Then
arr = Split(.Value, DELIM)
For Each itm In arr
oDic.Add itm, itm
Next
End If
End With
Next

oDic.Remove vbNullString
On Error GoTo 0

With oDic
ReDim arr(1 To .Count, 1 To 1)
Dim n As Long
itm = .Items
For n = 1 To .Count - 1
arr(n, 1) = itm(n - 1)
Next
End With

With Worksheets(2)
.UsedRange.ClearContents
.Cells(1, 1).Resize(oDic.Count, 1).Value = arr
.Cells(1, 1).Sort .Cells(1), xlAscending
End With

End Sub


--
keepITcool
| www.XLsupport.com | keepITcool chello nl | amsterdam


Craig Freeman wrote :

Good day,

I'm attempting to extract duplicate comma-delimited text strings (cell
character length greater that 255 - limitation of countif) found in a
column and return those results in a separate worksheet and in
separate rows for each duplicate found in the original column.

For example: (keep in mind, this example is less than 255 characters
per cell, but I need this to work for cell over 255 characters)

Sheet1

A1 : Horse,cow,pig,
A2 : Pig,cat,horse
A3 : Cow,dog,cat

Would return:

Sheet2

A1 : Horse
A2 : Cow
A3 : Pig
A4 : Cat
A5 : Dog

Any VBA miracle minds attempting this one, will have my deepest
gratitude.


thanks,

  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 17
Default Filtering Column Duplicates

Hi Tom,

Thanks for your help. Your modified code works well...but there seems
to be a couple of anomalies with its results. The first, is that for
some reason the code seems to ignores the last string (integer) in the
last row of the range...? Not sure why. The second anomaly is a
little harder to explain, so I'll give the scenario and the results in
the following examples:

Example 1:

cow,pig,dog
cow,pig,dog

returns:

cow
pig

Example 2:

cow,pig,dog
cow,pig,dog
dog

returns:

cow
pig

Example 3:

cat,dog
cow,pig,dog
cow,pig,dog

returns:

cat
cow
dog

I could put a few more examples of different combinations and the
results, but I don't want to confuse it too much. God knows I've
been scratching my head to find a commonality between the results.
Any ides..?

Craig Freeman

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 17
Default Filtering Column Duplicates

Hello keepITcool,

Thanks for taking the time to reply to my question.

I'm not sure if it's me, or some problem with the code, but I'm
gettting a 'compile error - variable not defined' with line 25:
'oDic.CompareMode = TextCompare'. Is there something that I need to do
to make this work...possible the something with 'Scripting
Dictionary'..? Any ideas?

thanks again,
Craig Freeman



  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 2,253
Default Filtering Column Duplicates

Craig.

Yep

I made my code "latebound" but easily convertable to "earlybound",
however I forgot to change the constant to it's corresponding value.

You'll have to change the constant TextCompare to 1,
(or 0 for BinaryCompare = case sensitive).

OR

You'll have better performance with the early bound variant,
by adding the reference, and swapping the commented lines.
for DIM and SET


--
keepITcool
| www.XLsupport.com | keepITcool chello nl | amsterdam


Craig Freeman wrote :

Hello keepITcool,

Thanks for taking the time to reply to my question.

I'm not sure if it's me, or some problem with the code, but I'm
gettting a 'compile error - variable not defined' with line 25:
'oDic.CompareMode = TextCompare'. Is there something that I need to
do to make this work...possible the something with 'Scripting
Dictionary'..? Any ideas?

thanks again,
Craig Freeman

  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 17
Default Filtering Column Duplicates

keepITcool,

Cool...that did the trick.

Interestingly enough, the same anomalies that I commented on with Tom's
modified code, also appeared with your suggestion..? I'd like to
understand why this is..? The workaround seems to be adding an
erroneous row of data at the end of the worksheet.

You'll have better performance with the early bound variant,
by adding the reference, and swapping the commented lines.
for DIM and SET


Sorry, I'm very new to this game and still learning. I've read that
'Late-bound' means that the interface is not known at compile time, and
that 'Early-bound' is known at compile time - and as you stated, this
increases performance. Kinda of like stating a fact before the
question is asked. What I'm not sure of, is the adding the reference
part and which commented lines for DIM and SET to switch. Apologies if
this is obvious.

thanks again,
Craig Freeman

  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 2,253
Default Filtering Column Duplicates

Craig,

Anomaly is in fact a programming error, causing the last item added to
the dictionary to be discarded in the writeback to the array.

line For n = 1 To .Count - 1
s/b For n = 1 To .Count

Re latebound:
good to see you've done a bit of homework, and understood.
I'll rephrase what I thought obvious, but apparently wasn't...

add reference to Microsoft Scripting Runtime

line dim oDic as Object
s/b dim oDic as Dictionary

line set oDic = CreateObject(etc)
s/b set oDic = new dictionary


ofcourse changing the 0 (back) to TextCompare is not needed,
but may make the code easier to read.

HTH

--
keepITcool
| www.XLsupport.com | keepITcool chello nl | amsterdam


Craig Freeman wrote :

keepITcool,

Cool...that did the trick.

Interestingly enough, the same anomalies that I commented on with
Tom's modified code, also appeared with your suggestion..? I'd like
to understand why this is..? The workaround seems to be adding an
erroneous row of data at the end of the worksheet.

You'll have better performance with the early bound variant,
by adding the reference, and swapping the commented lines.
for DIM and SET


Sorry, I'm very new to this game and still learning. I've read that
'Late-bound' means that the interface is not known at compile time,
and that 'Early-bound' is known at compile time - and as you stated,
this increases performance. Kinda of like stating a fact before the
question is asked. What I'm not sure of, is the adding the reference
part and which commented lines for DIM and SET to switch. Apologies
if this is obvious.

thanks again,
Craig Freeman

  #9   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 17
Default Filtering Column Duplicates

keepITcool,

That did it...in fact it fixed it for both solutions. You've been a
big help. Thank you again!

Craig Freeman

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Counting Cells in a column depending on another column but excluding duplicates rwenger Excel Discussion (Misc queries) 0 April 11th 10 01:06 PM
Filtering out duplicates while ignoring 1 row Allie Excel Discussion (Misc queries) 0 September 17th 08 03:14 PM
Filtering Duplicates Shu of AZ Excel Discussion (Misc queries) 11 December 31st 06 06:29 AM
Filtering a list with duplicates sharpie Excel Worksheet Functions 3 October 6th 05 01:32 AM
trouble filtering a list. Why isn't column filtering? Pat Excel Worksheet Functions 1 July 18th 05 03:30 PM


All times are GMT +1. The time now is 01:01 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"