Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
I'd like to take the average of all possible combinations of an array
and get the average of each set of permutations. I will specify how many permutions per set. Ultimately I will then look at the the percentage of these averages above or below a certain number. I've got some code below that will list all the permutations, however I have to add them together and then divide by the number of each set. I don't really know VBA, maybe someone can help with this. Also this code puts the results in a new worksheet. I would like this to allways put the results in the same worksheet, say worksheet two after first deleting any existing data. Again thanks in advance. The code is below: Option Explicit Dim vAllItems As Variant Dim Buffer() As String Dim BufferPtr As Long Dim Results As Worksheet ' ' Myrna Larson, July 25, 2000, Microsoft.Public.Excel.Misc Sub ListPermutationsOrCombinations() Dim Rng As Range Dim PopSize As Integer Dim SetSize As Integer Dim Which As String Dim n As Double Const BufferSize As Long = 4096 Worksheets("Sheet1").Range("A1").Select Set Rng = Selection.Columns(1).Cells If Rng.Cells.Count = 1 Then Set Rng = Range(Rng, Rng.End(xlDown)) End If PopSize = Rng.Cells.Count - 2 If PopSize < 2 Then GoTo DataError SetSize = Rng.Cells(2).Value If SetSize PopSize Then GoTo DataError Which = UCase$(Rng.Cells(1).Value) Select Case Which Case "C" n = Application.WorksheetFunction.Combin(PopSize, SetSize) Case "P" n = Application.WorksheetFunction.Permut(PopSize, SetSize) Case Else GoTo DataError End Select If n Cells.Count Then GoTo DataError Application.ScreenUpdating = False Set Results = Worksheets.Add vAllItems = Rng.Offset(2, 0).Resize(PopSize).Value ReDim Buffer(1 To BufferSize) As String BufferPtr = 0 If Which = "C" Then AddCombination PopSize, SetSize Else AddPermutation PopSize, SetSize End If vAllItems = 0 Application.ScreenUpdating = True Exit Sub DataError: If n = 0 Then Which = "Enter your data in a vertical range of at least 4 cells." _ & String$(2, 10) _ & "Top cell must contain the letter C or P, 2nd cell is the Number" _ & "of items in a subset, the cells below are the values from Which" _ & "the subset is to be chosen." Else Which = "This requires " & Format$(n, "#,##0") & _ " cells, more than are available on the worksheet!" End If MsgBox Which, vbOKOnly, "DATA ERROR" Exit Sub End Sub Private Sub AddPermutation(Optional PopSize As Integer = 0, _ Optional SetSize As Integer = 0, _ Optional NextMember As Integer = 0) Static iPopSize As Integer Static iSetSize As Integer Static SetMembers() As Integer Static Used() As Integer Dim i As Integer If PopSize < 0 Then iPopSize = PopSize iSetSize = SetSize ReDim SetMembers(1 To iSetSize) As Integer ReDim Used(1 To iPopSize) As Integer NextMember = 1 End If For i = 1 To iPopSize If Used(i) = 0 Then SetMembers(NextMember) = i If NextMember < iSetSize Then Used(i) = True AddPermutation , , NextMember + 1 Used(i) = False Else SavePermutation SetMembers() End If End If Next i If NextMember = 1 Then SavePermutation SetMembers(), True Erase SetMembers Erase Used End If End Sub 'AddPermutation Private Sub AddCombination(Optional PopSize As Integer = 0, _ Optional SetSize As Integer = 0, _ Optional NextMember As Integer = 0, _ Optional NextItem As Integer = 0) Static iPopSize As Integer Static iSetSize As Integer Static SetMembers() As Integer Dim i As Integer If PopSize < 0 Then iPopSize = PopSize iSetSize = SetSize ReDim SetMembers(1 To iSetSize) As Integer NextMember = 1 NextItem = 1 End If For i = NextItem To iPopSize SetMembers(NextMember) = i If NextMember < iSetSize Then AddCombination , , NextMember + 1, i + 1 Else SavePermutation SetMembers() End If Next i If NextMember = 1 Then SavePermutation SetMembers(), True Erase SetMembers End If End Sub 'AddCombination Private Sub SavePermutation(ItemsChosen() As Integer, _ Optional FlushBuffer As Boolean = False) Dim i As Integer, sValue As String Static RowNum As Long, ColNum As Long If RowNum = 0 Then RowNum = 1 If ColNum = 0 Then ColNum = 1 If FlushBuffer = True Or BufferPtr = UBound(Buffer()) Then If BufferPtr 0 Then If (RowNum + BufferPtr - 1) Rows.Count Then RowNum = 1 ColNum = ColNum + 1 If ColNum 256 Then Exit Sub End If Results.Cells(RowNum, ColNum).Resize(BufferPtr, 1).Value _ = Application.WorksheetFunction.Transpose(Buffer()) RowNum = RowNum + BufferPtr End If BufferPtr = 0 If FlushBuffer = True Then Erase Buffer RowNum = 0 ColNum = 0 Exit Sub Else ReDim Buffer(1 To UBound(Buffer)) End If End If 'construct the next set For i = 1 To UBound(ItemsChosen) sValue = sValue & ", " & vAllItems(ItemsChosen(i), 1) Next i 'and save it in the buffer BufferPtr = BufferPtr + 1 Buffer(BufferPtr) = Mid$(sValue, 3) End Sub 'SavePermutation |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
"Tom Mort" wrote...
I'd like to take the average of all possible combinations of an array and get the average of each set of permutations. I will specify how many permutions per set. Ultimately I will then look at the the percentage of these averages above or below a certain number. You're being sloppy with terminology. {1,2,3} and {3,1,2} represent different permutations of the same combination, and all permutations of the same combination have the same average (and sum and count and variance . . ..), so kinda pointless to average different permutations. A set with N elements has 2^N total subsets, and 2^N-2 nontrivial proper subsets, so a set with, say, 32 elements has over 4 billion (US - 10^9) subsets, or combinations of 1 to 32 elements. If you have an array of 32 positive numbers, and you want to know the percentage of all of its subsets with averages less than X, there are better approaches to take than brute force. Obviously all subsets containing only elements each less than X will have averages less than X, and all subsets containing only elements greater than X will have averages greater than X. More generally, all subsets of n < 32 elements which sum to less than n*X will have averages less than X, and all that sum to more than n*X will have averages greater than X. These sorts of optimizations could be done manually, and they'll be a lot faster than code implementing brute force algorithms. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
I did mean perumutions. I realize that the same numbers in a different order would have the same average, but, I'm interested in what percentage of the total permutations would meet a criteria. I guess I'd consider optimization, but I don't want to do it manually as this is intended to be used over and over and distributed to others. Ultimately all the remaining permutions would still have to be crunched so the initial question remains whether this code can be adapted to provide the average. A typical array for this would be <100 datapoints with the number of items in a set 5 or less *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
That code generates all the combinations/permutations of a subset of n items
taken from a set of m items where n must be less than or equal to m. It doesn't do all subsets of any size from an array of m items. Not sure what you really want to do. -- Regards, Tom Ogilvy "Tom Mort" wrote in message ... I did mean perumutions. I realize that the same numbers in a different order would have the same average, but, I'm interested in what percentage of the total permutations would meet a criteria. I guess I'd consider optimization, but I don't want to do it manually as this is intended to be used over and over and distributed to others. Ultimately all the remaining permutions would still have to be crunched so the initial question remains whether this code can be adapted to provide the average. A typical array for this would be <100 datapoints with the number of items in a set 5 or less *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
"Tom Mort" wrote...
I did mean perumutions. I realize that the same numbers in a different order would have the same average, but, I'm interested in what percentage of the total permutations would meet a criteria. .... And what do you mean by 'percentage of the total permutations would meet a criteria'? If these criteria were any of the typical descriptive statistics, then either 0% or 100% of *permutations* would satisfy them. That is, *all* permutations of the same combination of elements would have the same descriptive statistics. Or are you interested in order-dependent criteria? A typical array for this would be <100 datapoints with the number of items in a set 5 or less I should asked this direct question befo what are you trying to do? Note that SUMPRODUCT(COMBIN(90,{5;4;3;2;1})) returns 46,626,033, which is rather a lot of individual combinations to analyze using brute force techniques. Unless you intend to provide these users with a run-over-night or run-over-the-weekend application, you need to return the initial design stage and rethink what you're trying to do. It's a dead certainty your application as you've described it can't be interactive. |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
I thought I sent a reply yesterday but don't see it. What I'm trying to do is set up a program to help companies make an informed decesion as to how many wastewater sample to take for determining extra user charges and to compare with a monthly average limit. Certain business like bakeries etc have stronger sewage than regular households. To make up the difference in costs to treat this sewage, cites normally charge extra for concentrations above a certain amount. Sampling is required to determine the concentration. Also, at least in our community, there will be individual upper concentration limits (a monthly average) that companies are not to exceed. This is to insure that the municipal wasteater plant treatment capacity is not exceeded. In the past just a few samples per quarter were taken to assess the extra charge. As an upper limit comes into play businesses will be urged to perform self monitoring to assess compliance. That leads to the question of how frequently to do this. Each individual business has concentrations within a renge unique to that business. The upper limits (at least intially) are based on the 90th percentile or so of sampling results for the past couple years. On the one hand if a business doesn't have much variation they might be able to only take one sample per month and be reasonably sure they have a representative sample and that this one sample would meet the average limit. On the other extreme to account for variation a sample could be taken every day. That way the most representative sample would be taken and any high anomolies would be averaged out. What I't like is to have something that the businesses could enter their past results for a period of time and the number of samples they proposed to take per month and it would tell them what the odds (assuming current conditions and variability are within the range of past) that the average concentraion from this many samples would be under the average limit. *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
Tom Mort wrote...
... What I't like is to have something that the businesses could enter their past results for a period of time and the number of samples they proposed to take per month and it would tell them what the odds (assuming current conditions and variability are within the range of past) that the average concentraion from this many samples would be under the average limit. I was afraid of this. How much stats background do you have? When, exactly, would businesses sample themselves? When would the wast water facility or agency sample their output? Does the latter sampl each business on site? Your individual businesses are going to need to establish baselin samples, which means a few days of sampling at specific, relativel frequent intervals. From that, they may be able to establish relativel homogeneous periods within the day when they can take one of tw samples going forward. Anyway, your sampling procedure should *NOT* involve combinatorics. Yo can use standard statisitcal techniques to set sample sizes based o normal distribution assumptions (unless the sample concentration value show pronounced asymmetry) along with regional overall and specifi industry descriptive statistics to supplement individual busines sample values. This isn't my line of work, so I'm not comfortable enough to go int the statistical details. However, I'd be very surprised if there wasn' something available through SIAM that didn't address your needs closel if not exactly (there are one heck of a lot of engineers mathematicians and statisticians working in the chemical an electronics industries, and post-Super Fund, they devote A LOT of tim and attention to matters of managing discharges from their factories) -- Message posted from http://www.ExcelForum.com |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
Well for virtually all of the business there is at least many years worth of pretty much randomly taken samples. The samples are 24 hour composite samples and are set to take a sample every 15 minutes. If there is no flow for a period of time there is no sample. We would sample 3 days per quarter. So there is background data already. I have a little statistics background. What is SIAMs? I can get a suggested sample frequency by divding standard deviation by variance, but it doesn't really address averages of sets of samples. Tell me more *** Sent via Developersdex http://www.developersdex.com *** Don't just participate in USENET...get rewarded for it! |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
taking average of each set of all possible combinations
Tom Mort wrote...
... I have a little statistics background. What is SIAMs? ... Just SIAM, the Society of Industrial and Applied Mathematics. http://www.siam.or -- Message posted from http://www.ExcelForum.com |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Taking Average ignoring #DIV/0! in the range... | Excel Worksheet Functions | |||
taking an average of every 30 cells | Excel Discussion (Misc queries) | |||
getting combinations | New Users to Excel | |||
Combinations | Excel Worksheet Functions | |||
vba: Taking average of values in one column based on a corresponding value in another column | Excel Programming |