#1
Posted to microsoft.public.excel.programming
Unexplainable Worksheet Function Calculation Process
I have gotten these symptoms to show up in both Excel 2003 and Excel 2000. The following downloadable sheet has just the one user-defined function (in the cell selected when opened). To make testing manageable, calculation is purposely set to "manual" and "not on save".

After opening, pressing F9 (the calculate key) takes you through 1981 passes of the function before the sheet thinks it is calculated. The function has a 2001-cell range argument. Calculation can be forced by editing/changing any single cell in the range D18 through AA18. As usual, during calculation you can interrupt the process by selecting a cell on the sheet before it is done. Checking the seconds per function pass dumped to the Immediate Window, you will be able to estimate roughly how long it will take to finish all 1981 passes (5 minutes on one of my machines, 10 on another).

http://www.oz.net/~gurfler/Download/TestBugN.xls

Once you have let the calculation complete, it never takes that long again. The following sheet is in that state, but is in no other respect different:

http://www.oz.net/~gurfler/Download/TestBugOK.xls

Diagnostics illustrating very strange behavior during each pass of the function are dumped to the Immediate Window in the following (just hit F9 as soon as it is loaded):

http://www.oz.net/~gurfler/Download/TestBugDebug.xls

These diagnostics indicate that on the first pass the function sees only one cell in its range argument as non-empty, on the next pass two cells, and so on, more or less, until everything clears up after 1981 passes (why 20 short of 2001?). Making the range argument of the function into a worksheet array is a great way to prevent this horrible stall in calculation, though the diagnostic here does show that even then it still takes two passes to complete; on the first pass, apparently, all cells come through empty.

http://www.oz.net/~gurfler/Download/...DebugArray.xls

OK, no smart-alec comments: I know that this function merely duplicates what can be achieved through the following formula using exclusively Excel's built-in functions (entered as an array formula):

{=AVERAGE(SMALL(AJ27:AJ2027,AG28:AG128))}

My purpose is only to test the functionality of user-defined functions like this, and the example provided here shows some limitations, or at least the need for work-arounds in some instances. Does anybody know what is causing this behavior, and how to mitigate its effects? Is it really necessary/advisable to create arrays for all user-defined function arguments, or can something else be done to help avoid this bizarre behavior?

Thanks,
Nick
#2
That's a wonderful example of the way Excel's recalculation process works (this behaviour is by design). See http://www.decisionModels.com/Calcsecretsj.htm for an explanation and examples of how to make this UDF more efficient, and the associated pages for an explanation of Excel's calculation process.

The function below runs in about 0.6 of a second, as compared to over 300 on your TestBugN example (note you also have to change your sort routine to handle a 2-D variant array).

' Note: funcount, StartTime and SortOrderof come from the original
' poster's module (funcount and StartTime as module-level variables).
Function AverageUnderTailof(Distribution As Range, atEnd As Long, Number As Long) As Double
    Dim i As Long, StartofTail As Long, EnterTime As Double
    Dim vArr As Variant
    EnterTime = Timer
    If funcount = 0 Then StartTime = EnterTime
    vArr = Distribution.Value2
    If AreAnyEmpty(vArr) Then Exit Function
    StartofTail = 1 + atEnd * (UBound(vArr) - Number)
    AverageUnderTailof = 0
    With SortOrderof(vArr)
        For i = StartofTail To StartofTail + Number - 1
            AverageUnderTailof = AverageUnderTailof + vArr(.Values(i), 1)
        Next
    End With
    AverageUnderTailof = AverageUnderTailof / Number
    funcount = funcount + 1
    Debug.Print Format(funcount, "0"), Format(Timer - EnterTime, "0.0000"), _
        Format(Timer - StartTime, "0.0000"), Format(AverageUnderTailof, "#,##0.00")
End Function

Function AreAnyEmpty(vArr As Variant) As Boolean
    Dim j As Long
    AreAnyEmpty = False
    If VarType(vArr) >= 8192 Then   ' vbArray flag set: a multi-cell range gives a 2-D array
        For j = 1 To UBound(vArr)
            If IsEmpty(vArr(j, 1)) Then
                AreAnyEmpty = True
                Exit For
            End If
        Next j
    Else                            ' single-cell range: Value2 returns a scalar
        If IsEmpty(vArr) Then AreAnyEmpty = True
    End If
End Function

regards
Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com
#3
Charles:

First off, thanks for posting this extraordinarily thorough solution. Now I must exclaim, what a remarkable coincidence it is to see this great response of yours today! Earlier today I found your extremely informative web site, which has already explained enough for me to see (sort of) what was going on, at least enough to come up with the solution of filtering all cells of incoming ranges for the condition displayed on your site: IsEmpty(Cell.Value) And Len(Cell.Formula) > 0.

Your, no doubt better, solution below does raise a few questions; answers, if you are in a position and so inclined to provide them, would greatly help me understand a few long-standing VBA puzzles.

First, the most significant is invoked by your use of the statement

vArr = Distribution.Value2

Given that you must be much more versed in the efficiencies of VBA coding than I, I must assume that using vArr, with its awkward extra dimension, has advantages over directly accessing a range argument. If so, I am very curious what they would be. The way the assignment above is done in fact reminds me of a strange asymmetry I have never been able to comprehend: it is possible, with extreme efficiency, to assign very large two-dimensional VBA arrays to ranges by a simple assignment

Range object = VBA array variable

The above looks like it might be the way I have been looking for to achieve the reverse. Does going through vArr improve the relatively sluggish VBA response time seen when accessing range values by referencing each individual cell? This would be quite a revelation to me if you can say it is true, as I have never been able to figure out how to get similar speeds loading massive range values into VBA variables as I do loading large arrays back into Excel sheets.

Second, I do not understand the difference between .Value2 and .Value. In particular, is this distinction crucial in the context of your solution?

Thanks a million, Charles. I will be studying your site for some time to come, and look forward to any further insights you might be able to post here regarding my follow-up queries.

Nick
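For what it's worth, a minimal sketch of the asymmetry being asked about here (the sheet name, range addresses, and procedure name are hypothetical, and this is only my understanding of the mechanism): the same one-shot assignment appears to work in both directions, with the wrinkle that reading a range into an undimensioned Variant always yields a 1-based, two-dimensional array, even for a single column.

```vba
' Sketch only: bulk transfer in both directions between a sheet and VBA.
Sub BulkTransferDemo()
    Dim vArr As Variant
    Dim vOut(1 To 100, 1 To 1) As Double
    Dim i As Long

    ' Range -> VBA: one call returns a 1-based, 2-D Variant array,
    ' hence the "awkward extra dimension" even for a single column.
    vArr = Worksheets("Sheet1").Range("A1:A100").Value2

    For i = 1 To UBound(vArr, 1)
        vOut(i, 1) = vArr(i, 1) * 2
    Next i

    ' VBA -> Range: one assignment writes the whole array back.
    Worksheets("Sheet1").Range("B1:B100").Value2 = vOut
End Sub
```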
#5
Charles:

Just confirmed the efficiencies I suspected with your methods. I too see less than a second using your AreAnyEmpty function, compared to the original troublesome 300 seconds without it. Your vArr assignment must be doing something important too, since it takes 16 seconds using this simple filtering loop at the top of my AverageUnderTailof function:

For i = 1 To Distribution.Cells.Count
    If IsEmpty(Distribution(i).Value) And _
        Len(Distribution(i).Formula) > 0 Then Exit Function
Next

Nick
#6
Charles:

I don't quite get the "by design" phrase you use: why does Excel even bother to start calculating any cell formula expression (such as the one containing my, or any, user-defined function) until it has finished calculating all precedents? Or is such an approach incompatible with the general calculation-optimizing scheme used by Excel? It certainly seems unintuitive, but then very little in optimization solutions ever is . . .

Anyway, any thought on this, however brief, would be greatly appreciated, as it sure appeared to me (and, I would imagine, many others) to be more of a design "flaw" than "feature". If the answer is on your site, I have not yet found it . . .

Thanks again,
Nick

On Wed, 10 Dec 2008 11:44:30 -0000, "Charles Williams" wrote:
That's a wonderful example of the way Excel's recalculation process works (this behaviour is by design).
#7
Hi Nick,

To answer your questions:

There is a significant overhead associated with initiating transfers of data between Excel and VBA. I always think of VBA and Excel as being separated by a very large hill. See the Variant Benchmark on my downloads page to measure the relative size of the overhead on your system, and also http://www.decisionmodels.com/VersionCompare.htm

So the larger the amount you transfer in each read or write (use the largest truck possible for the transport over the hill), the more efficient it is. There is also a small tunnel through the hill through which VBA can reach and manipulate things on the Excel side (the tunnel = the object model). For instance, if you want to use a worksheet function in your VBA (MATCH, for example), it's more efficient for VBA to reach through the tunnel, tell Excel what to do on the Excel side of the hill, and only bring back the result of the MATCH, rather than transporting all the data over the hill.

Using .Value2 is more efficient than using .Value (and usually safer as well) because it does not do the additional implicit conversion for currency (which may lose precision) and date-formatted data. Unfortunately, the default is .Value.

Why does Excel evaluate cells containing references to uncalculated cells? I have not done an analysis, but I suspect that in general it's more efficient when recalculating, because the calculation chain will usually be well ordered, so it's not worth doing the extra work of checking. And there are probably cases where it needs to do this anyway (for instance, a formula containing multiple components, some of which reference calculated cells and some of which reference uncalculated cells, and maybe finding circular references).

Charles
___________________________________
London Excel Users Conference April 1-2: book now, not many places left!
The Excel Calculation Site
http://www.decisionmodels.com
#8
Charles:

Well, thanks once again. You have now not only helped me completely solve my original dilemma, but on the side revealed some wonderful additional solutions and insights, some of which I had long been seeking. I was already operating under the large-truck analogy, but did not know what truck to use for one direction of large array transfers: pre-dimensioned arrays can be sent to Excel ranges in one assignment statement, but the other direction just does not work. Until I saw your use of an undimensioned Variant for that purpose, I could only loop through all range cells to get their values into a VBA array, i.e. using the smallest truck possible.

Sorry I won't be in London for your conference, but your help for me here in Seattle will be long remembered and passed on wherever I can.

Nick