ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Find Combinations (https://www.excelbanter.com/excel-programming/450964-find-combinations.html)

Bruno Campanini[_2_]

Find Combinations
 
I have a list of integers (myrange)

ROW Numbers
1 1,2,3,8,10,14
2 21,26,34,45,60,66
3 8,18,19,23,45,46
4 21,36,38,41,48,49
n .................
Each row is ordered LeftToRight, ASC; n is in the order of 7000.

I have 6 numbers (ordered the same way)
21,26,34,45,48,60 (myrow)

I write all the combinations I can get from
myrow, they are 2^6 = 64 but excluding C6,0 they are 63:
6,1 = 6
6,2 = 15
6,3 = 20
6,4 = 15
6,5 = 6
6,6 = 1

Now I want to see how many times EACH combination out of the 63'
appears in myrange, and in what row.

The solution for the example is:
C6,1 = 8
C6,2 = 11
C6,3 = 10
C6,4 = 5
C6,5 = 1
C6,6 = 0

Somebody tried scanning any row of myrange 63 times but
it is a very much time consuming way (more than 1 hr).
I tried using queries and got a good time of some 3 minutes.

Any good idea using VBA without any support of queries?
(using queries my Excel 2013/64 bit is unstable and
crashes often).

Bruno

scattered[_5_]

Find Combinations
 
On Friday, June 26, 2015 at 8:30:13 PM UTC-4, Bruno Campanini wrote:
I have a list of integers (myrange)

ROW Numbers
1 1,2,3,8,10,14
2 21,26,34,45,60,66
3 8,18,19,23,45,46
4 21,36,38,41,48,49
n .................
Each row is ordered LeftToRight, ASC; n is in the order of 7000.

I have 6 numbers (ordered the same way)
21,26,34,45,48,60 (myrow)

I write all the combinations I can get from
myrow, they are 2^6 = 64 but excluding C6,0 they are 63:
6,1 = 6
6,2 = 15
6,3 = 20
6,4 = 15
6,5 = 6
6,6 = 1

Now I want to see how many times EACH combination out of the 63'
appears in myrange, and in what row.

The solution for the example is:
C6,1 = 8
C6,2 = 11
C6,3 = 10
C6,4 = 5
C6,5 = 1
C6,6 = 0

Somebody tried scanning any row of myrange 63 times but
it is a very much time consuming way (more than 1 hr).
I tried using queries and got a good time of some 3 minutes.

Any good idea using VBA without any support of queries?
(using queries my Excel 2013/64 bit is unstable and
crashes often).

Bruno


What does it mean for a combination to appear in myrange? You never defined it. I suspect you mean that it has to appear as consecutive items in some row -- though you never actually said that. In any event -- dictionaries are clearly the way to go. Include a reference to Microsoft Scripting Runtime in your project. For each nonempty combination of myrow, construct a key which looks like e.g. "26-45-48" and add it to the dictionary (with say "" as the value -- it doesn't matter). Then loop through your myrange row by row (after transferring it into a VBA array in one step to avoid the overhead of repeated use of things like range() or cells()). For each row, used nested for loops to construct all 21 possible keys corresponding to that row. When you construct a key -- check if it is in the dictionary. If it is - increment a counter. 21*7000 is only 147000. That many dictionary accesses should only take a few seconds. There is no reason that the sort of thing you describe should take any more than 10 seconds or so.

Bruno Campanini[_2_]

Find Combinations
 
scattered formulated the question :

What does it mean for a combination to appear in myrange?
You never defined it.

Ok, I should say
"EACH combination out of the 63'
appears in one or more rows in myrange, and in what row."


I suspect you mean that it has to appear as consecutive items in some row
-- though you never actually said that.

Not only consecutive: 1,2,3,x,y,z is as good as 1,x,y,2,3,z
or x,y,1,z,2,3 etc.
Let me try to explain with shorter rows (4 numbers instead of 6).

If myrow is 12,24,25,46 there are 15 combinations:
C4,1 12,24,25,46
C4,2 12-24, 12-25, 12-46
24-25, 24-46, 25-46
C4,3 12-24-25, 12-24-46
12-25-46, 24-25-46
C4,4 12,24,25,46

If there is in myrange a row like:
12,25,46,47
it will contain
3 combinations from C4,1: 12,25,46
4 combinations from C4,2: 12-25, 12-46, 24-25, 25-46
1 combination from C4,3: 12-25-46
0 from C4,4

Each row of myrange must be searched 69 times:
4 * 4 = 16 times for C4,1
6 * 6 = 36 times for C4,2
4 * 6 = 16 times for C4,3
1 time for C4,4

Clear enough? I hope so.
Back to 6-element rows.

I have 63 combinations out of myrow.
Each row in myrange must be searched
(C6,1)^2 + (C6,2)^2 + (C6,3)^2 + (C6,4)^2 + (C6,5)^2 + (C6,6)^2
= 923 times.

What is not clear to me is your suggestion to use the Dictionary
(or Coolections).
Do you suggest to put into Dictionary the 63 combinations from
myrow and looking for them into each row of myrange?
or what else?

Bruno

scattered[_5_]

Find Combinations
 
On Sunday, June 28, 2015 at 7:56:18 PM UTC-4, Bruno Campanini wrote:
scattered formulated the question :

What does it mean for a combination to appear in myrange?
You never defined it.

Ok, I should say
"EACH combination out of the 63'
appears in one or more rows in myrange, and in what row."


I suspect you mean that it has to appear as consecutive items in some row
-- though you never actually said that.

Not only consecutive: 1,2,3,x,y,z is as good as 1,x,y,2,3,z
or x,y,1,z,2,3 etc.
Let me try to explain with shorter rows (4 numbers instead of 6).

If myrow is 12,24,25,46 there are 15 combinations:
C4,1 12,24,25,46
C4,2 12-24, 12-25, 12-46
24-25, 24-46, 25-46
C4,3 12-24-25, 12-24-46
12-25-46, 24-25-46
C4,4 12,24,25,46

If there is in myrange a row like:
12,25,46,47
it will contain
3 combinations from C4,1: 12,25,46
4 combinations from C4,2: 12-25, 12-46, 24-25, 25-46
1 combination from C4,3: 12-25-46
0 from C4,4

Each row of myrange must be searched 69 times:
4 * 4 = 16 times for C4,1
6 * 6 = 36 times for C4,2
4 * 6 = 16 times for C4,3
1 time for C4,4

Clear enough? I hope so.
Back to 6-element rows.

I have 63 combinations out of myrow.
Each row in myrange must be searched
(C6,1)^2 + (C6,2)^2 + (C6,3)^2 + (C6,4)^2 + (C6,5)^2 + (C6,6)^2
= 923 times.

What is not clear to me is your suggestion to use the Dictionary
(or Coolections).
Do you suggest to put into Dictionary the 63 combinations from
myrow and looking for them into each row of myrange?
or what else?

Bruno


So it sounds like for each row in myrange you want to find the size of the intersection between the elements in that row and the elements in myrow, and to tabulate these sizes (how many times it is 0 -- although you didn't mention that explicitly, how many times it is 1, etc.) Is that an accurate description of what you want?

Bruno Campanini[_2_]

Find Combinations
 
scattered explained on 29-06-15 :

So it sounds like for each row in myrange you want to find the size of the
intersection between the elements in that row and the elements in myrow, and
to tabulate these sizes (how many times it is 0 -- although you didn't
mention that explicitly, how many times it is 1, etc.) Is that an accurate
description of what you want?


There is a problem in our communications:
either I'm unable to understand what I mean or
you are unable to understand what I say.
Tertium non datur.

Let me try, for the last time, to QBE (query by example),
taking into account a further semplification of the
previous example.

myrow = 12,24,25,46 whose C4,2 combinations a
1) 12-24 2) 12-25 3) 12-46
4) 24-25 5) 24-46 6) 25-46

in myrange there is this row:
12,25,41,46

Are you able to tell me (by what code or algorithm)
that that very row of myrange contains THREE of the
previously mentioned combinations, that is:
2) 12-25
3) 12-46
6) 25-46
???

Bruno

Bruno Campanini[_2_]

Find Combinations
 
scattered has brought this to us :

Hope that helps
-scattered

Ok, very nice piece of code! Thank you very much.
If works perfectly in less than 1/3 sec and - very important thing -
it is not necessary to order both myrow and myrange LeftToRight ASC.

May I make a comment...
Why:
ReDim matches(1 To n)
For i = 1 To n
matches(i) = 0
Next i

ReDim matches(1 To n)
OR
ReDim matches(1 To n) As Integer | Long
is enough... isn't?

Bruno

scattered[_5_]

Find Combinations
 
On Monday, June 29, 2015 at 8:33:46 PM UTC-4, Bruno Campanini wrote:
scattered has brought this to us :

Hope that helps
-scattered

Ok, very nice piece of code! Thank you very much.
If works perfectly in less than 1/3 sec and - very important thing -
it is not necessary to order both myrow and myrange LeftToRight ASC.

May I make a comment...
Why:
ReDim matches(1 To n)
For i = 1 To n
matches(i) = 0
Next i

ReDim matches(1 To n)
OR
ReDim matches(1 To n) As Integer | Long
is enough... isn't?

Bruno


Glad you could use it. I had fun writing it.

I have a strong preference for using variants to pass arrays to or from
functions/subs. For one thing -- it is somewhat idiomatic if you consider
the way built-in functions like Split() and Join() accept or return variant
arrays. For another thing -- it is easier to use variant arrays to transfer
data between VBA and the spreadsheet. In my code, I initially didn't include
that initialization loop for matches. But when I ran it matches(6) was Empty
rather than 0, and the way I was building up the report string in the test code
caused this empty value to be converted into the empty string rather than to 0.
This caused the last line to look like

Combos of size 6 =

with nothing to the right of the equals sign. Hence the explicit initialization
to 0.

-scattered

Bruno Campanini[_2_]

Find Combinations
 
scattered laid this down on his screen :

Glad you could use it. I had fun writing it.

Well, I'll have much to do in adapting your code to my needs...
For each matches() I need to get the row of myrange where it appairs.
I think it will take a lot of time compared with simply get how
many matches() in the full myrange.
I'll keep you informed.

I have a strong preference for using variants to pass arrays to or from
functions/subs.

I normally do the very same thing.

For one thing -- it is somewhat idiomatic if you consider
the way built-in functions like Split() and Join() accept or return variant
arrays. For another thing -- it is easier to use variant arrays to transfer
data between VBA and the spreadsheet. In my code, I initially didn't include
that initialization loop for matches. But when I ran it matches(6) was Empty
rather than 0, and the way I was building up the report string in the test
code caused this empty value to be converted into the empty string rather
than to 0. This caused the last line to look like

Combos of size 6 =

with nothing to the right of the equals sign. Hence the explicit
initialization to 0.

If you simply state
ReDim matches(1 To n)
you get matches() "full" of empty string

but if you state
ReDim matches(1 To n) As Integer or As Long or As Currency
you get matches() "full" of 0

Ciao
Bruno

PS
I'm from Italy.
Where are you from?

scattered[_5_]

Find Combinations
 
On Tuesday, June 30, 2015 at 6:20:07 AM UTC-4, Bruno Campanini wrote:
scattered laid this down on his screen :

Glad you could use it. I had fun writing it.

Well, I'll have much to do in adapting your code to my needs...
For each matches() I need to get the row of myrange where it appairs.
I think it will take a lot of time compared with simply get how
many matches() in the full myrange.
I'll keep you informed.

I have a strong preference for using variants to pass arrays to or from
functions/subs.

I normally do the very same thing.

For one thing -- it is somewhat idiomatic if you consider
the way built-in functions like Split() and Join() accept or return variant
arrays. For another thing -- it is easier to use variant arrays to transfer
data between VBA and the spreadsheet. In my code, I initially didn't include
that initialization loop for matches. But when I ran it matches(6) was Empty
rather than 0, and the way I was building up the report string in the test
code caused this empty value to be converted into the empty string rather
than to 0. This caused the last line to look like

Combos of size 6 =

with nothing to the right of the equals sign. Hence the explicit
initialization to 0.

If you simply state
ReDim matches(1 To n)
you get matches() "full" of empty string

but if you state
ReDim matches(1 To n) As Integer or As Long or As Currency
you get matches() "full" of 0

Ciao
Bruno

PS
I'm from Italy.
Where are you from?


On Tuesday, June 30, 2015 at 6:20:07 AM UTC-4, Bruno Campanini wrote:
(snip)


Out of curiosity, what is the motivation for this problem?

I have some ideas about how to modify the code so that it will keep track of the rows where each combination appears (and still run in a couple of seconds), but the resulting code would be too long to easily post. Perhaps we can continue this conversation via e-mail. Feel free to send me one (prefix "re" to my nickname and add the usual gmail suffix)

I didn't realize you could specify the variant subtype when redimensioning the variable. That is useful to know. Thanks.

I am from the US, near Pittsburgh. Professionally I do math and computer science (mostly math) which is why your post raised my interest.

Bruno Campanini[_2_]

Find Combinations
 
scattered formulated on Tuesday :

Out of curiosity, what is the motivation for this problem?

I don't really know, it is not a problem of mine.
The OP put the question from an italian Excel NG and its mathematical
aspect captured my interest in the very same way it happened to you.

I have some ideas about how to modify the code so that it will keep track of
the rows where each combination appears (and still run in a couple of
seconds), but the resulting code would be too long to easily post. Perhaps we
can continue this conversation via e-mail. Feel free to send me one (prefix
"re" to my nickname and add the usual gmail suffix)

Ok, I'll add some info to my worksheet containing the full solution
and I'll send you it by mail.

I didn't realize you could specify the variant subtype when redimensioning
the variable. That is useful to know. Thanks.

I am from the US, near Pittsburgh. Professionally I do math and computer
science (mostly math) which is why your post raised my interest.

I am living in Forlė, central-northen Italy, near Bologna, the oldest
(1088 AD) university in the western world.
I'm retired (nearly 80).

Bruno


All times are GMT +1. The time now is 07:34 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com