Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Hello,
I have a set of data that is 14 cells. I want to find the probability that the average of 5 randomly selected cells out of these 14 is over a certain point. HOW?! |
#2
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
You would need to know something about the distribution of the values in the
14 cells. However, let's say you know nothing about the values: You don't know how large or small they are. You don't even know if there are repeated values in the 14 or not. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. The first step is to make data. Pick 5 at random, calculate the average and store the result. Repeat this many, many times. Say we end up with data from A1 thru A10000 (each cell containing an average) Now knowing the MAX and MIN of the 10,000 sample we make bins for the data. The third step is to add counts to the bins based upon the 10,000 values. The bins represent a probability distribution. The last step is to compare the reference value with the bin counts. -- Gary''s Student - gsnu200789 "HelpMePlease" wrote: Hello, I have a set of data that is 14 cells. I want to find the probability that the average of 5 randomly selected cells out of these 14 is over a certain point. HOW?! |
#3
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
"Gary''s Student" wrote:
You would need to know something about the distribution of the values in the 14 cells. However, let's say you know nothing about the values: You don't know how large or small they are. You don't even know if there are repeated values in the 14 or not. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. The first step is to make data. Pick 5 at random, calculate the average and store the result. Repeat this many, many times. Say we end up with data from A1 thru A10000 (each cell containing an average) Now knowing the MAX and MIN of the 10,000 sample we make bins for the data. The third step is to add counts to the bins based upon the 10,000 values. The bins represent a probability distribution. The last step is to compare the reference value with the bin counts. -- Gary''s Student - gsnu200789 Just wondering if you could provide an example to the answer you provided that would visually show an elaboration. Not to hijack this post, but I made a post earlier "Subject: Predict/Probability" that was, I think, similar but I got no response. I provided a link to a sample xls file and I would love to finally see a good visual make up. Thank you now and well into the future, Luke |
#4
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Hi Luke:
Thanks for your interest. If you feel comfortable with VBA, then I will demo a solution completely in VBA. All you would need to do is to paste the macros into a fresh, new workbook. The macros would create the data on a worksheet, run the Monti Carlo, prepare the binning and present the probability distribution. Check back tomorrow!! -- Gary''s Student - gsnu200789 "Luke" wrote: "Gary''s Student" wrote: You would need to know something about the distribution of the values in the 14 cells. However, let's say you know nothing about the values: You don't know how large or small they are. You don't even know if there are repeated values in the 14 or not. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. The first step is to make data. Pick 5 at random, calculate the average and store the result. Repeat this many, many times. Say we end up with data from A1 thru A10000 (each cell containing an average) Now knowing the MAX and MIN of the 10,000 sample we make bins for the data. The third step is to add counts to the bins based upon the 10,000 values. The bins represent a probability distribution. The last step is to compare the reference value with the bin counts. -- Gary''s Student - gsnu200789 Just wondering if you could provide an example to the answer you provided that would visually show an elaboration. Not to hijack this post, but I made a post earlier "Subject: Predict/Probability" that was, I think, similar but I got no response. I provided a link to a sample xls file and I would love to finally see a good visual make up. Thank you now and well into the future, Luke |
#5
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
I have a set of data that is 14 cells. I want to find the probability
that the average of 5 randomly selected cells out of these 14 is over a certain point. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. Just wondering if you could provide an example to the answer you provided that would visually show an elaboration. I tried doing Monte Carlo this way. Put the 14 original numbers in A1:N1. Put the "certain point" in O1. In A2 put =RAND() and extend to N2. In A3 put =--(RANK(A2,2:2)<6) and extend to N3. In P3 put =--(SUMPRODUCT($A$1:$N$1,$A3:$N3)/5$O$1) In Q1 put =AVERAGE(P:P) Copy the pair of rows 2:3 and paste downward for thousands of rows. The probability estimate is in Q1. |
#6
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Very Nice, I will whatch for it.. I am just learning VBA, very light, but I
will be brave. I hope all this helps "HelpMePlease" as well. Could you look at http://www.freefilehosting.net/download/3fg7a and see if your VBA will apply to it? You can read about what I am looking for in my post on 4/22/2008 Subject: "Predict/Probability" Thank you Luke "Gary''s Student" wrote: Hi Luke: Thanks for your interest. If you feel comfortable with VBA, then I will demo a solution completely in VBA. All you would need to do is to paste the macros into a fresh, new workbook. The macros would create the data on a worksheet, run the Monti Carlo, prepare the binning and present the probability distribution. Check back tomorrow!! -- Gary''s Student - gsnu200789 "Luke" wrote: "Gary''s Student" wrote: You would need to know something about the distribution of the values in the 14 cells. However, let's say you know nothing about the values: You don't know how large or small they are. You don't even know if there are repeated values in the 14 or not. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. The first step is to make data. Pick 5 at random, calculate the average and store the result. Repeat this many, many times. Say we end up with data from A1 thru A10000 (each cell containing an average) Now knowing the MAX and MIN of the 10,000 sample we make bins for the data. The third step is to add counts to the bins based upon the 10,000 values. The bins represent a probability distribution. The last step is to compare the reference value with the bin counts. -- Gary''s Student - gsnu200789 Just wondering if you could provide an example to the answer you provided that would visually show an elaboration. Not to hijack this post, but I made a post earlier "Subject: Predict/Probability" that was, I think, similar but I got no response. I provided a link to a sample xls file and I would love to finally see a good visual make up. Thank you now and well into the future, Luke |
#7
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
On May 28, 12:44*pm, Gary''s Student
wrote: You can STILL get an estimate of the probability by performing a Monti Carlo analysis. That's Monte Carlo, with an "e", at least in English. The first step is to make data. *Pick 5 at random, calculate the average and store the result. *Repeat this many, many times. *Say we end up with data from A1 thru A10000 (each cell containing an average) If you are going to go to the trouble of generating that many 5-tuple selections, you might as well generate all selections of 5 out of 14. There are only 2002. The third step is to add counts to the bins based upon the 10,000 values. *The bins represent a probability distribution. Or simply sort the 2002 data (averages of 5).... The last step is to compare the reference value with the bin counts. ....And count the number greater than the reference value. (If you are going to store the 2002-data sorted table in a worksheet, you could use LOOKUP.) But I don't think that is a good general solution. It may or may not be what the instructor is looking for, depending on what class this is for. I will post a separate posting with a more general statistical approach, if someone else does not beat me to it. I don't have time at the moment. (I might even get around to posting the VB function to generate all 2002 combinations. It's quite straight-forward.) |
#8
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
PS....
On May 28, 4:31*pm, I wrote: On May 28, 12:44*pm, Gary''s Student wrote: You can STILL get an estimate of the probability by performing a Monti Carlo analysis. [....] If you are going to go to the trouble of generating that many 5-tuple selections, you might as well generate all selections of 5 out of 14. There are only 2002. Not to disparage "Gary's" approach or discourage him from posting the VB Monte Carlo simulation. I, too, would be interested in seeing the implementation. I struggle with VB because I use it so infrequently. |
#9
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
On May 28, 4:31*pm, I wrote:
If you are going to go to the trouble of generating that many 5-tuple selections, you might as well generate all selections of 5 out of 14. There are only 2002. [....] Or simply sort the 2002 data (averages of 5).... [....] ....And count the number greater than the reference value. See the VB sub gendata below. It's a hack. Skilled VB programmers might offer improvements. Usage: create a new worksheet called "data". Then highlight the range with the 14 numbers in the original worksheet; these must be horizontal (i.e. part of a row). Then execute the macro. The probability that the average of 5 random numbers out of the 14 is greater than "x" is (replace "x" with a number): =match(x, data!$a1:$a2002) / 2002 The macro is fast enough. But as it turns out, 90+% of the time is spent in sorting the resulting array. In this case, it is faster to use a function, even if we recreate the 2002 possible averages each time. (Programming note: that could be avoided by storing into a Public variable the first time.) See the VB function genprob below. Usage: first argument is "x"; second argument is the 14-number range (again, horizontal). In that case it would be useful to know the min and max 5-tuple averages. See the VB function minmax below. Usage: first argument is -1 for min, 1 for max; second argument is the 14-number range (horizontal). But the VB sub gendata is useful because it "returns" the entire array of 2002 averages. That allows us to do some statistical analysis. For example, use the Excel Histogram add-in (Data Analysis) and Chart Wizard to graph the distribution of the averages. You will discover that it is a normal distribution. That is an important discovery. It demonstrates a principle of the Central Limit Theorem that can be used to develop a more general statistical solution. If this is for a class in statistics and probability, I suspect that is what your instructor wants you to use for the solution. Obviously, more needs to be said. But I have run out of time. I'll try to post back later. But this is best explained by the likes of Jerry Lewis and Mike Middleton, if they are reading this thread. ----- Sub gendata() Dim pop As Variant Dim x(1 To 2002) As Double Dim i1 As Integer, i2 As Integer, i3 As Integer, i4 As Integer, i5 As Integer Dim i As Integer, j As Integer, k As Integer Dim t As Double Dim n As Long If Selection.Count < 14 Then Exit Sub pop = Selection.Value If UBound(pop, 1) < 1 Then Exit Sub ' must be horizontal n = 0 For i1 = 1 To 10 For i2 = i1 + 1 To 11 For i3 = i2 + 1 To 12 For i4 = i3 + 1 To 13 For i5 = i4 + 1 To 14 n = n + 1 x(n) = (pop(1, i1) + pop(1, i2) + pop(1, i3) + pop(1, i4) + pop(1, i5)) / 5 Next i5: Next i4: Next i3: Next i2: Next i1 ' sort in ascending order For i = 1 To n - 1 k = i For j = i + 1 To n If x(j) < x(k) Then k = j Next j If k < i Then t = x(i): x(i) = x(k): x(k) = t End If Next i Worksheets("Data").Range("A1:A2002").Value = WorksheetFunction.Transpose(x) End Sub Function genprob(ref As Double, rng As Range) As Double Dim pop As Variant Dim x As Double Dim i1 As Integer, i2 As Integer, i3 As Integer, i4 As Integer, i5 As Integer Dim cnt As Long, n As Long If rng.Count < 14 Then Exit Function pop = rng.Value If UBound(pop, 1) < 1 Then Exit Function ' must be horizontal n = 0: cnt = 0 For i1 = 1 To 10 For i2 = i1 + 1 To 11 For i3 = i2 + 1 To 12 For i4 = i3 + 1 To 13 For i5 = i4 + 1 To 14 n = n + 1 x = (pop(1, i1) + pop(1, i2) + pop(1, i3) + pop(1, i4) + pop(1, i5)) / 5 If x <= ref Then cnt = cnt + 1 Next i5: Next i4: Next i3: Next i2: Next i1 genprob = cnt / n End Function Function minmax(mm As Integer, rng As Range) As Double Dim pop As Variant Dim x As Double Dim i1 As Integer, i2 As Integer, i3 As Integer, i4 As Integer, i5 As Integer If rng.Count < 14 Then Exit Function pop = rng.Value If UBound(pop, 1) < 1 Then Exit Function ' must be horizontal xmin = 1E+308 xmax = -1E+308 For i1 = 1 To 10 For i2 = i1 + 1 To 11 For i3 = i2 + 1 To 12 For i4 = i3 + 1 To 13 For i5 = i4 + 1 To 14 x = (pop(1, i1) + pop(1, i2) + pop(1, i3) + pop(1, i4) + pop(1, i5)) / 5 If x < xmin Then xmin = x If x xmax Then xmax = x Next i5: Next i4: Next i3: Next i2: Next i1 minmax = IIf(mm < 0, xmin, xmax) End Function |
#10
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Errata....
On May 29, 9:15*am, I wrote: *The probability that the average of 5 random numbers out of the 14 is greater than "x" is (replace "x" with a number): =match(x, data!$a1:$a2002) / 2002 Actually, that is the probability of "less than or equal to". Correction: =1 - match(x, data!$a1:$a2002) / 2002 Similarly, in the VB function genprob, change the comparison to: * If x ref Then cnt = cnt + 1 (Caveat emptor: if you are cut-and-pasting the VB code, be aware that some lines might have been broken arbitrarily when I posted. The VBE will flag this errors; the remedy is simply to go to the end of the first line and press Space and Delete.) |
#11
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Sub monti()
' gsnuxx Set data = Range("A1:A14") Set rnnd = Range("B1:B14") Set sortarea = Union(data, rnnd) Set sample = Range("A1:A5") ' put the data in A1 thru A14 Cells.Clear data.Value = Application.Transpose(Array(1, 1, 2, 3, 4, 5, 6, 20, 21, 22, 23, 24, 25, 26)) rnnd.Formula = "=rand()" ' run the Monti Carlo Application.ScreenUpdating = False For i = 1 To 10000 sortarea.Sort Key1:=Range("B1"), header:=xlNo Cells(i, "C").Value = Application.WorksheetFunction.Average(sample) Next ' Put in the analysis equations Range("F1").Formula = "=max(C:C)" Range("F2").Formula = "=min(C:C)" Range("D1").Formula = "=F2" Range("D2").Formula = "=D1+(F$1-F$2)/10" Range("D2").Copy Range("D3:D11") Range("E2").Formula = "=COUNTIF(C:C,""<="" & D2)" Range("E2").Copy Range("E3:E11") Application.ScreenUpdating = True End Sub 1. cols A & B are filled 2. cols A & B are "shuffled" 10000 times and the average of the A1 thru A5 is recorded 3. analysis equations are pasted in. Here is what D2 thru E11 look like in a typical run: 2.2 4.38 93 6.56 307 8.74 1284 10.92 2801 13.1 4912 15.28 7228 17.46 8641 19.64 9701 21.82 9904 24 10000 Column D are the bin tops and column E are the counts. This says that there is a 100% probability that a randomly chosen 5 sample will average at or below 24 There is a 49.12% probability that the average will at or below 13.1, etc. If you want a specific value not at a bin edge (say 12) then use: =COUNTIF(C:C,"<=12") -- Gary''s Student - gsnu200789 "Luke" wrote: Very Nice, I will whatch for it.. I am just learning VBA, very light, but I will be brave. I hope all this helps "HelpMePlease" as well. Could you look at http://www.freefilehosting.net/download/3fg7a and see if your VBA will apply to it? You can read about what I am looking for in my post on 4/22/2008 Subject: "Predict/Probability" Thank you Luke "Gary''s Student" wrote: Hi Luke: Thanks for your interest. If you feel comfortable with VBA, then I will demo a solution completely in VBA. All you would need to do is to paste the macros into a fresh, new workbook. The macros would create the data on a worksheet, run the Monti Carlo, prepare the binning and present the probability distribution. Check back tomorrow!! -- Gary''s Student - gsnu200789 "Luke" wrote: "Gary''s Student" wrote: You would need to know something about the distribution of the values in the 14 cells. However, let's say you know nothing about the values: You don't know how large or small they are. You don't even know if there are repeated values in the 14 or not. You can STILL get an estimate of the probability by performing a Monti Carlo analysis. The first step is to make data. Pick 5 at random, calculate the average and store the result. Repeat this many, many times. Say we end up with data from A1 thru A10000 (each cell containing an average) Now knowing the MAX and MIN of the 10,000 sample we make bins for the data. The third step is to add counts to the bins based upon the 10,000 values. The bins represent a probability distribution. The last step is to compare the reference value with the bin counts. -- Gary''s Student - gsnu200789 Just wondering if you could provide an example to the answer you provided that would visually show an elaboration. Not to hijack this post, but I made a post earlier "Subject: Predict/Probability" that was, I think, similar but I got no response. I provided a link to a sample xls file and I would love to finally see a good visual make up. Thank you now and well into the future, Luke |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
probability | Excel Discussion (Misc queries) | |||
probability | Excel Worksheet Functions | |||
Probability | Excel Discussion (Misc queries) | |||
Advanced Conditional Formatting Ideas Needed! (ok, maybe not that advanced...) | Excel Discussion (Misc queries) | |||
Urgent! Probability? | New Users to Excel |