Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
I wrote this as an intellectual exercise after reading Dick Kusleika's blog
post on the same subject at http://www.dicks-blog.com/archives/2...s-from-arrays/ and thought I'd share it here. Dick's code is very straightforward and with his to-the-point comments easy to understand. The code below provides a different take on the subject addressing issues like: The requirement of balancing the returned arrays is improved through the use of the 'Rebalance' argument. If true, the code runs a little slower but returns better balanced arrays. The new subroutine is easier to use, IMO, because it uses ParamArrays. It *replaces* the contents of the original arrays. That may be good or bad depending on one's intent. The new code lifts the requirement that every element of every array be a variant (because of the "blanking the duplicate element" bit in the original code). The new code doesn't rely on a predetermined lower bound. In the VB6/VB.Net/legacy XL/VBA environment there is no way to force every array to have a particular lower bound. Watch out for line wraps. Sorry about that but trying to fix them would make reading the code even more of a chore. Option Explicit Public Type LenData 'Contains paramarray index and len of array in that position Idx As Long ArrLen As Long End Type Sub StartArrays() Dim vaOne As Variant Dim vaTwo As Variant Dim vaThree As Variant Dim vaMain As Variant 'Set up some secondary arrays vaOne = Array(1, 2, 3, 4, 5) vaTwo = Array(5, 6, 7) vaThree = Array(1, 5, 8, 9) vaOne = Array(5, 4, 3, 7, 1, 1) vaTwo = Array(5, 5, 6, 7, 5) vaThree = Array(8, 9, 2, 1, 2) Dim vaFour As Variant, I As Integer ReDim vaFour(1 To 10) As Integer For I = LBound(vaFour) To UBound(vaFour): vaFour(I) = I: Next I RemoveDuplicates True, vaOne, vaTwo, vaThree, vaFour ShowResults vaOne, vaTwo, vaThree, vaFour End Sub Function ArrLen(anArr, Optional aDim As Integer = 1) On Error Resume Next ArrLen = UBound(anArr, aDim) - LBound(anArr, aDim) + 1 End Function Function DynArrLen(anArr, LastElement As Long, Optional aDim As Integer = 1) On Error Resume Next DynArrLen = LastElement - LBound(anArr, aDim) + 1 End Function Function sortIndices(vaMain As Variant, _ ValidLastElement() As Long) As LenData() Dim Rslt() As LenData, I As Long, J As Long, _ Temp As LenData ReDim Rslt(LBound(vaMain) To UBound(vaMain)) For I = LBound(vaMain) To UBound(vaMain) Rslt(I).Idx = I Rslt(I).ArrLen = DynArrLen(vaMain(I), ValidLastElement(I)) Next I 'Bubble sort the array containing the lengths of the _ original arrays For I = LBound(Rslt) To UBound(Rslt) - 1 For J = I + 1 To UBound(Rslt) If Rslt(I).ArrLen Rslt(J).ArrLen Then Temp = Rslt(I) Rslt(I) = Rslt(J) Rslt(J) = Temp End If Next J Next I sortIndices = Rslt End Function Sub removeElement(ByRef origArr, ByRef ValidLastElement As Long, _ Idx As Long) origArr(Idx) = origArr(ValidLastElement) ValidLastElement = ValidLastElement - 1 End Sub Sub shrinkArray(ByRef anArr, ByVal NewLastElementIdx) If NewLastElementIdx < LBound(anArr) Then Erase anArr Else ReDim Preserve anArr(LBound(anArr) To NewLastElementIdx) End If Exit Sub End Sub Sub RemoveDuplicates(attemptRebalance As Boolean, ParamArray InArr()) 'Returns the arrays with duplicates removed. One instance of _ every duplicate will be returned, though which array it will be _ in is not predetermined. 'WARNING: Every argument to this procedure must be a variant _ ======= which contains an array, i.e., of the nature _ Dim x:redim x(..) w/ or w/o an explicit type or _ Dim x:x=array(...) It cannot be something declared as an array, _ i.e., it cannot be Dim({bounds}) or Dim x() w/ or w/o an _ explicit type. If one of the arguments is an array (as opposed _ to a variant that contains an array) the results will be _ catastrophic. The returned data will be corrupt and a 2nd call _ on this routine will corrupt various memory structures. Excel _ will crash and ask to talk to mama. Dim vaMain() vaMain = InArr 'vaMain is used solely because I couldn't figure out how to pass _ Inarr to a procedure such as sortIndices. It appears that there _ are several undocumented but compiler-enforced restrictions on a _ paramarray's use. If Not IsArray(vaMain) Then Exit Sub Dim I As Long For I = LBound(vaMain) To UBound(vaMain) If Not IsArray(vaMain(I)) Then Exit Sub Next I 'Here's how we 'remove' duplicates. Use 2 helper arrays. The _ first, SortedIndices maintains a list of the indices of the _ actual arrays based on their respective array sizes. Hence _ the array pointed to by SortedIndices(0) is the smallest array. _ The 2nd helper array is the ValidLastElement. It has one entry _ for each actual array. This entry is the index of the last _ valid element in that array. It's how we 'remove' duplicates _ (see below). 'The overall algorithm is as follows: _ Go through each array starting with the smallest array - _ indicated by the first element in sortedIndices doing the _ following: _ For every element in this 'master' array -- upto the _ last-valid-element pointer: _ Search every array including the master array and _ 'remove' the element, if found, from the array being searched. _ An element is 'removed' using the foll. technique: _ Replace that element with the current _ last-valid-element and decrement the pointer by 1. _ When, the master array is completely analyzed, just do a redim _ preserve and the dups will be gone from that array. _ This technique should dramatically speed up the procedure. A _ secondary speed benefit will be that when we are going through an _ array we only have to go up to the current ValidLastElement. Dim ValidLastElement() As Long ReDim ValidLastElement(LBound(vaMain) To UBound(vaMain)) Dim J As Long, K As Long, KK As Long For I = LBound(ValidLastElement) To UBound(ValidLastElement) ValidLastElement(I) = UBound(vaMain(I)) Next I DoRebalance: Dim sortedIndices() As LenData sortedIndices = sortIndices(vaMain, ValidLastElement) For I = LBound(sortedIndices) To UBound(sortedIndices) - 1 'the last array does not have to be checked; just _ redimmed correctly. For K = LBound(vaMain(sortedIndices(I).Idx)) _ To ValidLastElement(sortedIndices(I).Idx) 'optimized down from _ UBound(vaMain(sortedIndices(I).Idx)) For J = UBound(sortedIndices) To I Step -1 If J = I Then KK = K + 1 Else KK = LBound(vaMain(sortedIndices(J).Idx)) End If Do While KK <= ValidLastElement(sortedIndices(J).Idx) 'optimized down from UBound(vaMain(sortedIndices (J).Idx)) 'Cannot use a For loop since ValidLastElement(...) _ may/will be modified within the loop If vaMain(sortedIndices(I).Idx)(K) = _ vaMain(sortedIndices(J).Idx)(KK) Then removeElement vaMain(sortedIndices(J).Idx), _ ValidLastElement(sortedIndices(J).Idx), KK If J = LBound(sortedIndices) Then ElseIf attemptRebalance _ And DynArrLen(vaMain(sortedIndices(J).Idx), _ ValidLastElement(sortedIndices(J).Idx)) _ < DynArrLen(vaMain(sortedIndices(J - 1).Idx), _ ValidLastElement(sortedIndices(J - 1).Idx)) Then Debug.Print "Rebalance: I=" & I _ & ", sortedindices(i).idx=" & sortedIndices (I).Idx _ & ", J=" & J _ & ", sortedindices(j).idx=" & sortedIndices (J).Idx _ & ", sortedindices(j-1).idx=" & sortedIndices(J - 1).Idx _ & ",ValidLastElement(sortedIndices(J).idx) =" _ & ValidLastElement(sortedIndices(J).Idx) _ & ",ValidLastElement(sortedIndices(J- 1).idx=" _ & ValidLastElement(sortedIndices(J - 1).Idx) GoTo DoRebalance '<<<<< End If End If KK = KK + 1 Loop Next J Next K shrinkArray vaMain(sortedIndices(I).Idx), _ ValidLastElement(sortedIndices(I).Idx) 'OK to shrink (optimize) because this array will never be changed Next I shrinkArray vaMain(sortedIndices(UBound(sortedIndices)).Idx), _ ValidLastElement(sortedIndices(UBound(sortedIndice s)).Idx) For I = LBound(vaMain) To UBound(vaMain) Erase InArr(I): InArr(I) = vaMain(I) Next I End Sub Sub ShowResults(ParamArray vaMain()) Dim I As Long, J As Long For I = LBound(vaMain) To UBound(vaMain) If ArrLen(vaMain(I)) 0 Then For J = LBound(vaMain(I)) To UBound(vaMain(I)) Debug.Print I, J, vaMain(I)(J) Next J Else Debug.Print I, "No elements" End If Debug.Print "---------------" Next I End Sub -- Regards, Tushar Mehta www.tushar-mehta.com Excel, PowerPoint, and VBA add-ins, tutorials Custom MS Office productivity solutions |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Removing Duplicates from multiple worksheets | Excel Discussion (Misc queries) | |||
Removing Duplicates Help | Excel Discussion (Misc queries) | |||
Removing Duplicates | Excel Discussion (Misc queries) | |||
Removing Duplicates | Excel Worksheet Functions | |||
Removing Duplicates | Excel Programming |