Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Good day,
I'm attempting to extract duplicate comma-delimited text strings (cell character length greater that 255 - limitation of countif) found in a column and return those results in a separate worksheet and in separate rows for each duplicate found in the original column. For example: (keep in mind, this example is less than 255 characters per cell, but I need this to work for cell over 255 characters) Sheet1 A1 : Horse,cow,pig, A2 : Pig,cat,horse A3 : Cow,dog,cat Would return: Sheet2 A1 : Horse A2 : Cow A3 : Pig A4 : Cat A5 : Dog Any VBA miracle minds attempting this one, will have my deepest gratitude. thanks, |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Modification of this code from John Walkenback's site. Assumes xl2000 or
later http://j-walk.com/ss/excel/tips/tip47.htm Sub RemoveDuplicates() Dim AllCells As Range, Cell As Range Dim NoDupes As New Collection Dim i As Integer, v As Variant ' The items are in A1:A105 Set AllCells = Range("A1:A105") ' The next statement ignores the error caused ' by attempting to add a duplicate key to the collection. ' The duplicate is not added - which is just what we want! On Error Resume Next For Each Cell In AllCells v = Split(cell,",") for i = lbound(v) to ubound(v) v(i) = Replace(v(i),".","") v(i) = Replace(v(i),"?","") v(i) = Replace(v(i),"!","") NoDupes.Add v(i), CStr(v(i)) ' Note: the 2nd argument (key) for the Add method must be a string Next i Next Cell ' Resume normal error handling On Error GoTo 0 redim v(1 to NoDupes.count, 1 to 1) ' Sort the collection (optional) For i = 1 To NoDupes.Count - 1 v(i,1) = NoDupes(i) Next i With Range("J1").Resize(Nodupes.count,1) .Value = v .sort Key1:=Range("J1"), Order1:=xlAscending, _ header:=xlNo end With End Sub Code has not been tested and may contain typos. -- Regards, Tom Ogilvy "Craig Freeman" wrote in message om... Good day, I'm attempting to extract duplicate comma-delimited text strings (cell character length greater that 255 - limitation of countif) found in a column and return those results in a separate worksheet and in separate rows for each duplicate found in the original column. For example: (keep in mind, this example is less than 255 characters per cell, but I need this to work for cell over 255 characters) Sheet1 A1 : Horse,cow,pig, A2 : Pig,cat,horse A3 : Cow,dog,cat Would return: Sheet2 A1 : Horse A2 : Cow A3 : Pig A4 : Cat A5 : Dog Any VBA miracle minds attempting this one, will have my deepest gratitude. thanks, |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
My solution is very similar to Tom's post. however I've used a Dictionary object (from Scripting Runtime) rather then a Collection. I find it more versatile and in many cases it outperforms the collection, certainly when used WITH a reference to the library and Dim'ed as dictionary Here goes: Option Explicit Const DELIM = "," Sub Parser() Dim arr As Variant Dim itm As Variant Dim rSrc As Range Dim rCel As Range Dim oDic As Object 'We're using an unbound Scripting Dictionary object 'Preferably add reference to Microsoft Scripting Runtime and use 'Dim oDic As Dictionary 'Set oDic = New Dictionary Set oDic = CreateObject("Scripting.Dictionary") 'Dictionary s/b case INsensitive oDic.CompareMode = TextCompare 'Set source (Limited to usedrange) Set rSrc = Worksheets(1).Columns(1) Set rSrc = Intersect(rSrc, rSrc.Parent.UsedRange) 'Ignore errors (thrown on duplicate entries) On Error Resume Next For Each rCel In rSrc.Cells With rCel If Not IsNumeric(.Value) Then arr = Split(.Value, DELIM) For Each itm In arr oDic.Add itm, itm Next End If End With Next oDic.Remove vbNullString On Error GoTo 0 With oDic ReDim arr(1 To .Count, 1 To 1) Dim n As Long itm = .Items For n = 1 To .Count - 1 arr(n, 1) = itm(n - 1) Next End With With Worksheets(2) .UsedRange.ClearContents .Cells(1, 1).Resize(oDic.Count, 1).Value = arr .Cells(1, 1).Sort .Cells(1), xlAscending End With End Sub -- keepITcool | www.XLsupport.com | keepITcool chello nl | amsterdam Craig Freeman wrote : Good day, I'm attempting to extract duplicate comma-delimited text strings (cell character length greater that 255 - limitation of countif) found in a column and return those results in a separate worksheet and in separate rows for each duplicate found in the original column. For example: (keep in mind, this example is less than 255 characters per cell, but I need this to work for cell over 255 characters) Sheet1 A1 : Horse,cow,pig, A2 : Pig,cat,horse A3 : Cow,dog,cat Would return: Sheet2 A1 : Horse A2 : Cow A3 : Pig A4 : Cat A5 : Dog Any VBA miracle minds attempting this one, will have my deepest gratitude. thanks, |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Hi Tom,
Thanks for your help. Your modified code works well...but there seems to be a couple of anomalies with its results. The first, is that for some reason the code seems to ignores the last string (integer) in the last row of the range...? Not sure why. The second anomaly is a little harder to explain, so I'll give the scenario and the results in the following examples: Example 1: cow,pig,dog cow,pig,dog returns: cow pig Example 2: cow,pig,dog cow,pig,dog dog returns: cow pig Example 3: cat,dog cow,pig,dog cow,pig,dog returns: cat cow dog I could put a few more examples of different combinations and the results, but I don't want to confuse it too much. God knows I've been scratching my head to find a commonality between the results. Any ides..? Craig Freeman |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Hello keepITcool,
Thanks for taking the time to reply to my question. I'm not sure if it's me, or some problem with the code, but I'm gettting a 'compile error - variable not defined' with line 25: 'oDic.CompareMode = TextCompare'. Is there something that I need to do to make this work...possible the something with 'Scripting Dictionary'..? Any ideas? thanks again, Craig Freeman |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Craig.
Yep I made my code "latebound" but easily convertable to "earlybound", however I forgot to change the constant to it's corresponding value. You'll have to change the constant TextCompare to 1, (or 0 for BinaryCompare = case sensitive). OR You'll have better performance with the early bound variant, by adding the reference, and swapping the commented lines. for DIM and SET -- keepITcool | www.XLsupport.com | keepITcool chello nl | amsterdam Craig Freeman wrote : Hello keepITcool, Thanks for taking the time to reply to my question. I'm not sure if it's me, or some problem with the code, but I'm gettting a 'compile error - variable not defined' with line 25: 'oDic.CompareMode = TextCompare'. Is there something that I need to do to make this work...possible the something with 'Scripting Dictionary'..? Any ideas? thanks again, Craig Freeman |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
keepITcool,
Cool...that did the trick. Interestingly enough, the same anomalies that I commented on with Tom's modified code, also appeared with your suggestion..? I'd like to understand why this is..? The workaround seems to be adding an erroneous row of data at the end of the worksheet. You'll have better performance with the early bound variant, by adding the reference, and swapping the commented lines. for DIM and SET Sorry, I'm very new to this game and still learning. I've read that 'Late-bound' means that the interface is not known at compile time, and that 'Early-bound' is known at compile time - and as you stated, this increases performance. Kinda of like stating a fact before the question is asked. What I'm not sure of, is the adding the reference part and which commented lines for DIM and SET to switch. Apologies if this is obvious. thanks again, Craig Freeman |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
Craig,
Anomaly is in fact a programming error, causing the last item added to the dictionary to be discarded in the writeback to the array. line For n = 1 To .Count - 1 s/b For n = 1 To .Count Re latebound: good to see you've done a bit of homework, and understood. I'll rephrase what I thought obvious, but apparently wasn't... add reference to Microsoft Scripting Runtime line dim oDic as Object s/b dim oDic as Dictionary line set oDic = CreateObject(etc) s/b set oDic = new dictionary ofcourse changing the 0 (back) to TextCompare is not needed, but may make the code easier to read. HTH -- keepITcool | www.XLsupport.com | keepITcool chello nl | amsterdam Craig Freeman wrote : keepITcool, Cool...that did the trick. Interestingly enough, the same anomalies that I commented on with Tom's modified code, also appeared with your suggestion..? I'd like to understand why this is..? The workaround seems to be adding an erroneous row of data at the end of the worksheet. You'll have better performance with the early bound variant, by adding the reference, and swapping the commented lines. for DIM and SET Sorry, I'm very new to this game and still learning. I've read that 'Late-bound' means that the interface is not known at compile time, and that 'Early-bound' is known at compile time - and as you stated, this increases performance. Kinda of like stating a fact before the question is asked. What I'm not sure of, is the adding the reference part and which commented lines for DIM and SET to switch. Apologies if this is obvious. thanks again, Craig Freeman |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Filtering Column Duplicates
keepITcool,
That did it...in fact it fixed it for both solutions. You've been a big help. Thank you again! Craig Freeman |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Counting Cells in a column depending on another column but excluding duplicates | Excel Discussion (Misc queries) | |||
Filtering out duplicates while ignoring 1 row | Excel Discussion (Misc queries) | |||
Filtering Duplicates | Excel Discussion (Misc queries) | |||
Filtering a list with duplicates | Excel Worksheet Functions | |||
trouble filtering a list. Why isn't column filtering? | Excel Worksheet Functions |