![]() |
Bias in rand for excel 07
I'm testing random numbers for games using excel 07 generator, but there is
too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
Here are my results from excel 2003
1 1652 2 1715 3 1689 4 1668 5 1661 6 1615 Sub countrand() Dim numbers(6) For i = 1 To 6 numbers(i) = 0 Next i For i = 1 To 10000 myRand = Int(6 * Rnd()) + 1 numbers(myRand) = numbers(myRand) + 1 Next i For i = 1 To 6 Range("A" & i) = i Range("B" & i) = numbers(i) Next i End Sub "yttrias" wrote: I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
And what was your sample size?
-- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
On Dec 16, 2:22 pm, "Bernard Liengme"
wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVPwww.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias The flaw in the MS random number generator has been known for some time. A description is included in this MS info sheet: http://support.microsoft.com/kb/829208 I use a simulation package that has it's own generator. But there are probably others you can download via a web search. You may want to run statistical tests on ones that you are considering for validation. SteveM |
Bias in rand for excel 07
But http://support.microsoft.com/kb/828795/ suggests RAND is OK in XL 2003
and 2007 -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "SteveM" wrote in message ... On Dec 16, 2:22 pm, "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVPwww.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias The flaw in the MS random number generator has been known for some time. A description is included in this MS info sheet: http://support.microsoft.com/kb/829208 I use a simulation package that has it's own generator. But there are probably others you can download via a web search. You may want to run statistical tests on ones that you are considering for validation. SteveM |
Bias in rand for excel 07
yttrias -
Which "excel 07 generator" are you using? RAND() worksheet function, or RND function in VBA, or Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ? - Mike Middleton http://www.DecisionToolworks.com Decision Analysis Add-ins for Excel "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
I don't know about the OP, but Joel's posted code used the VBA Rnd()
function. Prior to 2007, there were 3 distinct random number generators, the worksheet function RAND(), VBA Rnd(), and the ATP random number generator. None were very good before 2003, but their problems would arise as high order autocorrelations between "random" numbers not as the kind of problems the Joel was testing for. It is unlikely to show up in the process that the OP seems to describe. Therefore I would want to know how the OP generated "integer numbers between 1 and 6" and what the OP was using as expected values. A decent, but not great algorithm was implemented for worksheet RAND() (requires SP1 patch to work properly) and carried into 2007. I have seen no evidence that VBA Rnd() has ever been upgraded. RANDBETWEEN() became a worksheet function in 2007, and therefore probably switched from using ATP random number generator to using the worksheet RAND() function, though I have seen no confirmation of this. Joel's simulation shows departures from expectation that are NOT statistically significant. Generating 10000 sums of 2 values using worksheet RAND() and expected values of (6-ABS(7-tot2))/36, I tried several reps in 2003 with no statistically significant departure from expectation. The following VBA code does 10 reps of 10000 using VBA Rnd(), and only one rep was statistically different from expectation (again in 2003). Therefore, I am somewhat skeptical of the OP's claims until more information is provided. Jerry Sub tryit() Application.ScreenUpdating = False ' to speed execution For j = 1 To 10 For i = 1 To 10000 x = Fix(Rnd() * 6) + Fix(Rnd() * 6) + 2 ' sum of two random integers between 1 and 6 Cells(x, j) = Cells(x, j) + 1 Next i Next j Application.ScreenUpdating = True End Sub "Mike Middleton" wrote: yttrias - Which "excel 07 generator" are you using? RAND() worksheet function, or RND function in VBA, or Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ? - Mike Middleton http://www.DecisionToolworks.com Decision Analysis Add-ins for Excel "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
You are still providing too little information for us to understand what you
are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
I ran 1 million samples using RAND in VBA in excel 07. Then I sampled
40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. How are you using th Randomize statement? If you are executing it more than one time, that could be skewing your results. Consider this posting I have posted in the past... The Randomize statement should not be executed more than once for the lifespan of the object that is executing your code. Doing so more often actually makes the data less random than if Randomize is only run once. For so few colors, and the probable use the OP wants to use the Rnd function for, it will more than likely not matter here, but it is the concept that is important to know. I can't demo it within Excel because I am not aware of an available object where I can set the color of individual pixels (like a PictureBox control in the compiled version of VB for those who have worked with that language); however, I do have a VB program that demonstrates this fact visually. For those of you having access to the compiled versions of VB5 or VB6, here is posting I have offered over in the compiled VB newsgroups in the past that demonstrates this fact... Running Randomize multiple times ends up producing a less random set of numbers. To see the problem visually, use this code (which is a modification of a routine Bob Butler once posted). Start a new project and put two PictureBox'es on your Form (use the default names for everything and placement of the PictureBox'es is not important). Paste the following code into the Form's code window. The PictureBox on the left results from using Randomize only once, the one on the right uses it repeatedly. Both PictureBox displays are produced from the same looping code with the only difference being the use of the Randomize statement Ignoring the pronounce vertical areas (not sure what that is, probably some kind of boundary rounding problem), for which one does the distribution of colors look more "random"? Rick Const SCALESIZE = 3 Private Sub Form_Load() Picture1.ScaleMode = 3 Picture2.ScaleMode = 3 Randomize Picture1.Move 0, 0, _ 128 * Screen.TwipsPerPixelX * SCALESIZE, _ 128 * Screen.TwipsPerPixelY * SCALESIZE Picture2.Move Picture1.Width, 0, _ 128 * Screen.TwipsPerPixelX * SCALESIZE, _ 128 * Screen.TwipsPerPixelY * SCALESIZE Me.Width = 2.02 * Picture1.Width Me.Height = 1.1 * Picture1.Height End Sub Private Sub Picture1_Paint() Dim i As Long Dim j As Long Dim colr As Long Dim bitmask As Long For i = 0 To Picture1.ScaleHeight Step SCALESIZE For j = 0 To Picture1.ScaleWidth Step SCALESIZE colr = Rnd * 16711680 Picture1.Line (j, i)-Step(SCALESIZE, _ SCALESIZE), colr, BF Next j Next i For i = 0 To Picture2.ScaleHeight Step SCALESIZE For j = 0 To Picture2.ScaleWidth Step SCALESIZE Randomize colr = Rnd * 16711680 Picture2.Line (j, i)-Step(SCALESIZE, _ SCALESIZE), colr, BF Next j Next i End Sub |
Bias in rand for excel 07
"Rick Rothstein (MVP - VB)" wrote in message
For those of you having access to the compiled versions of VB5 or VB6, here is posting I have offered over in the compiled VB newsgroups in the past that demonstrates this fact... Running Randomize multiple times ends up producing a less random set of numbers. To see the problem visually, use this code (which is a modification of a routine Bob Butler once posted). Start a new project and put two PictureBox'es on your Form (use the default names for everything and placement of the PictureBox'es is not important). Paste the following code into the Form's code window. The PictureBox on the left results from using Randomize only once, the one on the right uses it repeatedly. Both PictureBox displays are produced from the same looping code with the only difference being the use of the Randomize statement Ignoring the pronounce vertical areas (not sure what that is, probably some kind of boundary rounding problem), for which one does the distribution of colors look more "random"? Rick In particular this bit - Ignoring the pronounced vertical areas (not sure what that is, probably some kind of boundary rounding problem) Is it some rounding problem or could it be related to the non random nature of random, even with the single Randomize. Actually I didn't notice those vertical areas you refer to at first in the left box, however I amended a couple of values in your code, in particular changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the sizes by 1 pixel to exaggerate. Now I get very distinct not entirely random vertical areas in the left box. For the second box I changed Randomize in each loop to Randomize 1, just for fun. Amended code - Const SCALESIZE = 3 Private Sub Form_Load() Const Z As Long = 129 ' original 128 Me.Left = 10 Picture1.ScaleMode = 3 Picture2.ScaleMode = 3 Randomize Picture1.Move 0, 0, _ Z * Screen.TwipsPerPixelX * SCALESIZE, _ Z * Screen.TwipsPerPixelY * SCALESIZE Picture2.Move Picture1.Width, 0, _ Z * Screen.TwipsPerPixelX * SCALESIZE, _ Z * Screen.TwipsPerPixelY * SCALESIZE Me.Width = 2.02 * Picture1.Width Me.Height = 1.1 * Picture1.Height End Sub Private Sub Picture1_Paint() Dim i As Long Dim j As Long Dim colr As Long Dim bitmask As Long Const C As Long = 16777215 ' original 16711680 For i = 0 To Picture1.ScaleHeight Step SCALESIZE For j = 0 To Picture1.ScaleWidth Step SCALESIZE colr = Rnd * C Picture1.Line (j, i)-Step(SCALESIZE, _ SCALESIZE), colr, BF Next j Next i For i = 0 To Picture2.ScaleHeight Step SCALESIZE For j = 0 To Picture2.ScaleWidth Step SCALESIZE Randomize ' original Randomize 1 colr = Rnd * C Picture2.Line (j, i)-Step(SCALESIZE, _ SCALESIZE), colr, BF Next j Next i End Sub Regards, Peter T |
Bias in rand for excel 07
A similar demo for a VBA Userform -
Option Explicit Private Declare Function FindWindow Lib "user32" _ Alias "FindWindowA" _ (ByVal lpClassName As String, _ ByVal lpWindowName As String) As Long Private Declare Function SetPixel Lib "gdi32" _ (ByVal hdc As Long, ByVal x As Long, _ ByVal y As Long, ByVal crColor As Long) As Long Private Declare Function GetPixel Lib "gdi32" _ (ByVal hdc As Long, ByVal x As Long, ByVal y As Long) As Long Private Declare Function GetDC Lib "user32" ( _ ByVal hwnd As Long) As Long Private Declare Function ReleaseDC Lib "user32" ( _ ByVal hwnd As Long, ByVal hdc As Long) As Long Private Sub PaintPixels() Dim hwnd As Long, hdc As Long Dim tp As Long, lt As Long Dim x As Long, y As Long Dim colr As Long Const Z As Long = 128 * 2 - 1 Const C As Long = 16777215 Me.Left = 10: Me.Top = 10 Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60 ' if form is too small, change 0.75 to 1 or 1.25 hwnd = FindWindow("ThunderDFrame", Me.Caption) hdc = GetDC(hwnd) tp = Me.Top + 15 lt = Me.Left + 10 Randomize For y = tp To tp + Z For x = lt To lt + Z colr = Rnd * C SetPixel hdc, x, y, colr Next Next lt = lt + Z + 15 For y = tp To tp + Z For x = lt To lt + Z Randomize 1 colr = Rnd * C SetPixel hdc, x, y, colr Next Next ReleaseDC hwnd, hdc End Sub Private Sub UserForm_Activate() Me.Caption = "Click me to (re-) PaintPixels" PaintPixels End Sub Private Sub UserForm_Click() ' Me.Repaint PaintPixels End Sub Peter T |
Bias in rand for excel 07
Actually I didn't notice those vertical areas you refer to at first in the
left box, however I amended a couple of values in your code, in particular changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the sizes by 1 pixel to exaggerate. It was quite some time ago when I developed that particular demonstration, so I forgot some of the original details. Actually, now that I think about it, I believe I adjusted the numbers to minimize the vertical streak I mentioned and didn't think about it when I grabbed the text from a different message I had posted about the original (pre-modified) code. The constant multiplier for the Rnd function was the key, as I remember. Nice thought about the 1 pixel adjustment. Rick |
Bias in rand for excel 07
"Rick Rothstein (MVP - VB)" wrote in message
Actually I didn't notice those vertical areas you refer to at first in the left box, however I amended a couple of values in your code, in particular changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the sizes by 1 pixel to exaggerate. It was quite some time ago when I developed that particular demonstration, so I forgot some of the original details. Actually, now that I think about it, I believe I adjusted the numbers to minimize the vertical streak I mentioned and didn't think about it when I grabbed the text from a different message I had posted about the original (pre-modified) code. The constant multiplier for the Rnd function was the key, as I remember. Nice thought about the 1 pixel adjustment. Rick This bit - The constant multiplier for the Rnd function was the key, as I remember. The constant multiplier, depending on what it is, helps to highlight there is little if anything random about vb/vba's Rnd. Instead it seems to generate a sequence, albeit not an immediately obvious one. I guess this is all fully documented and for most purposes of little consequence. I examined the colours generated in the vba-userform demo (posted nearby). In visually obvious bands adjacent RGB's in a series might be something like (say) Red & Green are identical but Blue increments uniformly, then repeat with a small increment to the G attribute. Or there might be an obvious chequerboard pattern of colours in a section, with every other colour almost identical. Regards, Peter T |
Bias in rand for excel 07
Ah, so that is how you draw into a UserForm... ThunderDFrame... I'll have to
remember that. Thanks for performing the conversion. First, I removed the "1" off of the Randomize statement in the second loop so that both Randomize statements draw from the Timer for their seed values. Yes, this scatters the bar effect on the second picture, but I think it is a fairer display of what is going on. Second, I think if you change the Const C assignment statement in the PaintPixel procedure to this... Const C As Long = 16711680 you get a much starker (and more amazing) contrast between the two displays, especially with the the Randomizer seed change mention above, especially when you repeatedly click the UserForm. Yes, this removes the sharp vertical lines that the seed value of 1 produced, but it still shows (and more fairly I think) the sharp difference in randomization that occurs if you use the Randomize statement only once. The modified code implementing the above is shown below my signature. Again, thanks for performing the compiled VB to Excel VBA conversion... you did a nice job with it. I will be using your coded translation (with appropriate acknowledgement to you, of course) should a similar question rise in the future. Rick Private Declare Function FindWindow Lib "user32" _ Alias "FindWindowA" _ (ByVal lpClassName As String, _ ByVal lpWindowName As String) As Long Private Declare Function SetPixel Lib "gdi32" _ (ByVal hDc As Long, _ ByVal x As Long, _ ByVal y As Long, _ ByVal crColor As Long) As Long Private Declare Function GetPixel Lib "gdi32" _ (ByVal hDc As Long, _ ByVal x As Long, _ ByVal y As Long) As Long Private Declare Function GetDC Lib "user32" ( _ ByVal hWnd As Long) As Long Private Declare Function ReleaseDC Lib "user32" ( _ ByVal hWnd As Long, ByVal hDc As Long) As Long Private Sub PaintPixels() Dim hWnd As Long, hDc As Long Dim tp As Long, lt As Long Dim x As Long, y As Long Dim colr As Long Const Z As Long = 128 * 2 - 1 Const C As Long = 16711680 Me.Left = 10: Me.Top = 10 Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60 ' if form is too small, change 0.75 to 1 or 1.25 hWnd = FindWindow("ThunderDFrame", Me.Caption) hDc = GetDC(hWnd) tp = Me.Top + 15 lt = Me.Left + 10 Randomize For y = tp To tp + Z For x = lt To lt + Z colr = Rnd * C SetPixel hDc, x, y, colr Next Next lt = lt + Z + 15 For y = tp To tp + Z For x = lt To lt + Z Randomize colr = Rnd * C SetPixel hDc, x, y, colr Next Next ReleaseDC hWnd, hDc End Sub Private Sub UserForm_Activate() Me.Caption = "Click me to (re-) PaintPixels" PaintPixels End Sub Private Sub UserForm_Click() ' Me.Repaint PaintPixels End Sub |
Bias in rand for excel 07
I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
The code was very helpful. I agree that the expcted value in F1 should be
1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the missing decimal points hindered my understanding). I am surprised that CountSevens() worked at all, since i is declared an integer, but takes values that are outside the valid range for an integer--that may have been the problem. There is a slight bias built into using either the INT() or TRUNC() functions. For example =((2-2^-52)-2) is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary underpinnings, returns 2. Therefore there will be a slight reduction in the number of 6's and a slight increase in the number of 1's in your pseudo random integers, but these slight biases should not be statistically detectable in an experiment of your size. If you are concerned, you can avoid them by using an intermediate cell to hold the =RAND() values which you would then transform with the slightly more complicated formula =IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6)) Some recalculations of the random integers occur during the execution of CountSevens(). It is not clear to me whether these might be occurring during the execution of the For/Next loop; regardless, I don't see how it could cause an undercount of 7's. Here is an alternate VBA code, that simplifies and combines the work of both OneMillionEvents() and CountSevens(). It also replaces the static value in F1 with a dynamic formula that will update whenever the RAND() functions recalculate. I tried about a dozen reps in Excel 2007, and got only one rep that was statistically different from expectation. Sub SimplerVersion() With Range("A1:C1000000") .ClearContents .NumberFormat = "0" End With Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)" Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]" Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" End Sub Jerry "yttrias" wrote: I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
"Rick Rothstein (MVP - VB)" wrote in message
First, I removed the "1" off of the Randomize statement in the second loop so that both Randomize statements draw from the Timer for their seed values. I had intended to post it that way, somehow forgot! Second, I think if you change the Const C assignment statement in the PaintPixel procedure to this... Const C As Long = 16711680 you get a much starker (and more amazing) contrast between the two displays Hmm, for me Const C As Long = 16777215 gives a very much better visual indication, not so much of the contrast between the two displays but very obvious vertical bands, even the first display (not random pixels at all). I wonder why, maybe Rnd() seeds or calculates slightly differently in our systems. I had been testing in W98SE and have now repeated in Vista. The first display visually looks identical in both systems, but the second is somewhat different; though in both systems the vertical banding is more obvious in both displays with the 16777215 multiplier. Maybe for some users neither value will produce anything obvious. Ah, so that is how you draw into a UserForm... ThunderDFrame... I'll have to remember that. and just in case need to cater for Excel97 the userform's classname is ThunderXFrame I will be using your coded translation (with appropriate acknowledgement to you, of course) It's all pretty generic stuff so no acknowledgement necessary, but appreciate the thought. The code was the bare minimum for the demo, better to get system's pointsPerPixel - Private Declare Function GetDeviceCaps Lib "gdi32" ( _ ByVal hdc As Long, ByVal nIndex As Long) As Long Private Const POINTS_PER_INCH As Long = 72 Private Const LOGPIXELSX As Long = 88 Dim mdPointsPerPixel As Double Private Sub GetPPI() Dim dpi As Long dpi = GetDeviceCaps(GetDC(0), LOGPIXELSX) mdPointsPerPixel = POINTS_PER_INCH / dpi ReleaseDC 0, hdc End Sub In the demos replace the 0.75 with mdPointsPerPixel. But somehow I think you already know all that. <g Perhaps use frm.Width - .Insidewidth & .Height -.Insideheight to get the inside coordinates, net of borders & caption instead of the approximations I used. There are of course other, possibly better/faster, methods to draw on a userform or any window than SetPixel. Regards, Peter T |
Bias in rand for excel 07
Thank you very much. With the simplified code you provided, I am able to
average the entire 1 million events in one run, which I couldn't do with the code I posted. (My system limitation.) With your code, tests of groups of 1 million repeatedly gave avg to within 0.12% of expected. There is more deviation, of course, when the samples are reduced to 40,000 points, but with enough repeated samples, I'm getting averages of samples to be pretty close to the expected value. This has been helpful to me. -- yttrias "Jerry W. Lewis" wrote: The code was very helpful. I agree that the expcted value in F1 should be 1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the missing decimal points hindered my understanding). I am surprised that CountSevens() worked at all, since i is declared an integer, but takes values that are outside the valid range for an integer--that may have been the problem. There is a slight bias built into using either the INT() or TRUNC() functions. For example =((2-2^-52)-2) is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary underpinnings, returns 2. Therefore there will be a slight reduction in the number of 6's and a slight increase in the number of 1's in your pseudo random integers, but these slight biases should not be statistically detectable in an experiment of your size. If you are concerned, you can avoid them by using an intermediate cell to hold the =RAND() values which you would then transform with the slightly more complicated formula =IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6)) Some recalculations of the random integers occur during the execution of CountSevens(). It is not clear to me whether these might be occurring during the execution of the For/Next loop; regardless, I don't see how it could cause an undercount of 7's. Here is an alternate VBA code, that simplifies and combines the work of both OneMillionEvents() and CountSevens(). It also replaces the static value in F1 with a dynamic formula that will update whenever the RAND() functions recalculate. I tried about a dozen reps in Excel 2007, and got only one rep that was statistically different from expectation. Sub SimplerVersion() With Range("A1:C1000000") .ClearContents .NumberFormat = "0" End With Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)" Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]" Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" End Sub Jerry "yttrias" wrote: I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
Jerry,
Thanks again for your code suggestion. Although the array formula works fine, I don't understand the syntax. How does the "*1" function? Can I substitute a variable for the "7" delineator? (When I try to do that, I get error.) Do you know of any published reference to this, and other, helpful array formula expressions that are (for me) beyond the ordinary? -- yttrias "Jerry W. Lewis" wrote: The code was very helpful. I agree that the expcted value in F1 should be 1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the missing decimal points hindered my understanding). I am surprised that CountSevens() worked at all, since i is declared an integer, but takes values that are outside the valid range for an integer--that may have been the problem. There is a slight bias built into using either the INT() or TRUNC() functions. For example =((2-2^-52)-2) is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary underpinnings, returns 2. Therefore there will be a slight reduction in the number of 6's and a slight increase in the number of 1's in your pseudo random integers, but these slight biases should not be statistically detectable in an experiment of your size. If you are concerned, you can avoid them by using an intermediate cell to hold the =RAND() values which you would then transform with the slightly more complicated formula =IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6)) Some recalculations of the random integers occur during the execution of CountSevens(). It is not clear to me whether these might be occurring during the execution of the For/Next loop; regardless, I don't see how it could cause an undercount of 7's. Here is an alternate VBA code, that simplifies and combines the work of both OneMillionEvents() and CountSevens(). It also replaces the static value in F1 with a dynamic formula that will update whenever the RAND() functions recalculate. I tried about a dozen reps in Excel 2007, and got only one rep that was statistically different from expectation. Sub SimplerVersion() With Range("A1:C1000000") .ClearContents .NumberFormat = "0" End With Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)" Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]" Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" End Sub Jerry "yttrias" wrote: I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
(R825001C3:R865000C3=7)
is an array of logical values, which AVERAGE would ignore. Multiplying by one coerces TRUE to 1 and FALSE to 0, so that you average an indicator variable of whether 7 occurred in a cell or not. This type of thing is probably discussed in some 'tips and tricks' books, but I don't have access to any and so can't give you a recommendation. Jerry "yttrias" wrote: Jerry, Thanks again for your code suggestion. Although the array formula works fine, I don't understand the syntax. How does the "*1" function? Can I substitute a variable for the "7" delineator? (When I try to do that, I get error.) Do you know of any published reference to this, and other, helpful array formula expressions that are (for me) beyond the ordinary? -- yttrias "Jerry W. Lewis" wrote: The code was very helpful. I agree that the expcted value in F1 should be 1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the missing decimal points hindered my understanding). I am surprised that CountSevens() worked at all, since i is declared an integer, but takes values that are outside the valid range for an integer--that may have been the problem. There is a slight bias built into using either the INT() or TRUNC() functions. For example =((2-2^-52)-2) is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary underpinnings, returns 2. Therefore there will be a slight reduction in the number of 6's and a slight increase in the number of 1's in your pseudo random integers, but these slight biases should not be statistically detectable in an experiment of your size. If you are concerned, you can avoid them by using an intermediate cell to hold the =RAND() values which you would then transform with the slightly more complicated formula =IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6)) Some recalculations of the random integers occur during the execution of CountSevens(). It is not clear to me whether these might be occurring during the execution of the For/Next loop; regardless, I don't see how it could cause an undercount of 7's. Here is an alternate VBA code, that simplifies and combines the work of both OneMillionEvents() and CountSevens(). It also replaces the static value in F1 with a dynamic formula that will update whenever the RAND() functions recalculate. I tried about a dozen reps in Excel 2007, and got only one rep that was statistically different from expectation. Sub SimplerVersion() With Range("A1:C1000000") .ClearContents .NumberFormat = "0" End With Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)" Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]" Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" End Sub Jerry "yttrias" wrote: I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
Bias in rand for excel 07
I presume your question is dealing with this line of code..
Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" Because the text string is being assigned to the FormulaArray property of the Range, it means that the text will be evaluated as a spreadsheet formula, so it must meet the rules for those type of functions. The *1 is multiplying the logical expression, which will be a TRUE or FALSE result, by a number in order to turn the TRUE or FALSE result into a number (which the AVERAGE function needs in order to do its calculation). Sometimes you see this conversion handled by a double unary affixed to the beginning of the logical expression (in essence, forcing the calculation by multiply by minus one times minus one.... Range("F1").FormulaArray = "=AVERAGE(--(R825001C3:R865000C3=7))" As for making the "7" a variable instead of a constant, something like this should work... Dim Num As Long Num = 7 Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=" & CStr(Num) & ")*1)" Rick Thanks again for your code suggestion. Although the array formula works fine, I don't understand the syntax. How does the "*1" function? Can I substitute a variable for the "7" delineator? (When I try to do that, I get error.) |
Bias in rand for excel 07
Very helpful. Thank you.
-- yttrias "Rick Rothstein (MVP - VB)" wrote: I presume your question is dealing with this line of code.. Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" Because the text string is being assigned to the FormulaArray property of the Range, it means that the text will be evaluated as a spreadsheet formula, so it must meet the rules for those type of functions. The *1 is multiplying the logical expression, which will be a TRUE or FALSE result, by a number in order to turn the TRUE or FALSE result into a number (which the AVERAGE function needs in order to do its calculation). Sometimes you see this conversion handled by a double unary affixed to the beginning of the logical expression (in essence, forcing the calculation by multiply by minus one times minus one.... Range("F1").FormulaArray = "=AVERAGE(--(R825001C3:R865000C3=7))" As for making the "7" a variable instead of a constant, something like this should work... Dim Num As Long Num = 7 Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=" & CStr(Num) & ")*1)" Rick Thanks again for your code suggestion. Although the array formula works fine, I don't understand the syntax. How does the "*1" function? Can I substitute a variable for the "7" delineator? (When I try to do that, I get error.) |
Bias in rand for excel 07
Very helpful. Thank you.
-- yttrias "Jerry W. Lewis" wrote: (R825001C3:R865000C3=7) is an array of logical values, which AVERAGE would ignore. Multiplying by one coerces TRUE to 1 and FALSE to 0, so that you average an indicator variable of whether 7 occurred in a cell or not. This type of thing is probably discussed in some 'tips and tricks' books, but I don't have access to any and so can't give you a recommendation. Jerry "yttrias" wrote: Jerry, Thanks again for your code suggestion. Although the array formula works fine, I don't understand the syntax. How does the "*1" function? Can I substitute a variable for the "7" delineator? (When I try to do that, I get error.) Do you know of any published reference to this, and other, helpful array formula expressions that are (for me) beyond the ordinary? -- yttrias "Jerry W. Lewis" wrote: The code was very helpful. I agree that the expcted value in F1 should be 1/6=0.666... and that 0.1587 is a sigificant departure from expectation (the missing decimal points hindered my understanding). I am surprised that CountSevens() worked at all, since i is declared an integer, but takes values that are outside the valid range for an integer--that may have been the problem. There is a slight bias built into using either the INT() or TRUNC() functions. For example =((2-2^-52)-2) is <0, so =INT(2-2^-52) should be 1, but MS, in trying to hide the binary underpinnings, returns 2. Therefore there will be a slight reduction in the number of 6's and a slight increase in the number of 1's in your pseudo random integers, but these slight biases should not be statistically detectable in an experiment of your size. If you are concerned, you can avoid them by using an intermediate cell to hold the =RAND() values which you would then transform with the slightly more complicated formula =IF(((x*6)-INT(x*6))<0,INT(x*6)-1,INT(x*6)) Some recalculations of the random integers occur during the execution of CountSevens(). It is not clear to me whether these might be occurring during the execution of the For/Next loop; regardless, I don't see how it could cause an undercount of 7's. Here is an alternate VBA code, that simplifies and combines the work of both OneMillionEvents() and CountSevens(). It also replaces the static value in F1 with a dynamic formula that will update whenever the RAND() functions recalculate. I tried about a dozen reps in Excel 2007, and got only one rep that was statistically different from expectation. Sub SimplerVersion() With Range("A1:C1000000") .ClearContents .NumberFormat = "0" End With Range("A1:B1000000").FormulaR1C1 = "=1+INT(RAND()*6)" Range("C1:C1000000").FormulaR1C1 = "=RC[-2]+RC[-1]" Range("F1").FormulaArray = "=AVERAGE((R825001C3:R865000C3=7)*1)" End Sub Jerry "yttrias" wrote: I am grateful to those of you who have responded so quickly to my OP. I hope you will be able to point out the errors of my ways. The following is the code sequence for the generation of two sets of 1-million integers between 1 and 6, and their sum. Then, the subsequent code counts the number of sevens in a sample of 40,000 sums (divided by 40,000) to give decimal representation to be compared with ideal ratio of 1/6. I ran the latter code 20 times at arbitrary locations, and averaged those results. Sub OneMillionEvents() Range("A1:C1000000").Select Selection.ClearContents Selection.NumberFormat = "0" Range("A1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("B1").Select ActiveCell.FormulaR1C1 = "=1+int(RAND()*6)" Range("C1").Select ActiveCell.FormulaR1C1 = "=RC[-2]+RC[-1]" Range("A1:C1000000").Select Selection.FillDown Calculate End Sub Sub CountSevens() Dim i, n As Integer Range("A1").Select ActiveCell(1, 5) = 40000 ActiveCell(1, 6) = 0 n = 0 For i = 825001 to 865000 If ActiveCell(i, 3) = 7 Then n = n + 1 End If Next i ActiveCell(1, 6) = n / ActiveCell(1, 5) End Sub Regards, Yttrias -- yttrias "Jerry W. Lewis" wrote: You are still providing too little information for us to understand what you are doing. There is no RAND() function in VBA. In VBA, you either used the VBA Rnd() function, or the worksheet RAND() function via Evaluate("RAND()"). As noted in my earlier post, in 2003 the worksheet RAND() function was far superior to the VBA Rnd(). I have seen no evidence that either function changed between 2003 and 2007. Your original post discussed the sum of two (presumably independent) random integers between 1 and 6 (such as the total from rolling two fair dice). The expected value of a single such sum would be 7. It is unclear how that relates to either your observed average of 1587 or expected average of 1667 for an "average of 20 windows of 40,000 consecutive values. It sounds as though your VBA would involve only a few lines of code. If you paste that code into your reply, we can see exactly what you did. Jerry "yttrias" wrote: I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000 consecutive values at 20 different locations throughout the range of 1M. The average of those 20 windows in the 1M was 1587, where the expected average should be 1667. -- yttrias "Bernard Liengme" wrote: And what was your sample size? -- Bernard V Liengme Microsoft Excel MVP www.stfx.ca/people/bliengme remove caps from email "yttrias" wrote in message ... I'm testing random numbers for games using excel 07 generator, but there is too much bias away from statistical expectation. For example, when two integer numbers are generated between 1 and 6, and subsequently added, the results differ from the easily predictable numbers. Sevens are consistently in short supply by about 1.5%. That's a lot! Any suggestions for better randomization in excel would be very much appreciated. -- yttrias |
All times are GMT +1. The time now is 04:55 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com