![]() |
Avoiding redundant calculations
I am using Excel 2003 on Windows Vista.
Mostly, I do little calculation in worksheets. I do everything with VBA in PERSONAL.XLS. However, I have a new working formula: <=INT(--(A2<A1)*VLOOKUP($F2,Stations,38,TRUE)+--(A2<A3)*VLOOKUP($G2,St ations,38,TRUE)) which is copied through to about 3000 rows. i.e. it does about 6000 VLOOKUP calls and takes a few minutes. --(A2<A1) and -(A2<A3) evaluate as 1 or 0. The VLOOKUP calls are unnecessary if those expressions result in 0. This applies to about 80% of the VLOOKUP calls. I would like to avoid that 80%. One thought is the IF function, but that does not do lazy evaluation. e.g. =IF(True,Truepart,Falsepart) calculates both Truepart and Falsepart. I could use a UDF, but prefer the simplicity of calculations in worksheets. I would value light from microsoft.public.excel.worksheet.functions. -- Walter Briscoe |
Avoiding redundant calculations
Hi Walter,
Am Sun, 18 Aug 2013 08:25:28 +0100 schrieb Walter Briscoe: <=INT(--(A2<A1)*VLOOKUP($F2,Stations,38,TRUE)+--(A2<A3)*VLOOKUP($G2,St ations,38,TRUE)) try: =INT(IF(A2<A1,VLOOKUP($F2,Stations,38,TRUE))+IF(A 2<A3,VLOOKUP($G2,Stations,38,TRUE))) Regards Claus B. -- Win XP PRof SP2 / Vista Ultimate SP2 Office 2003 SP2 /2007 Ultimate SP2 |
Avoiding redundant calculations
Hi Walter,
Am Sun, 18 Aug 2013 08:25:28 +0100 schrieb Walter Briscoe: One thought is the IF function, but that does not do lazy evaluation. e.g. =IF(True,Truepart,Falsepart) calculates both Truepart and Falsepart. start the Evaluate Formula dialog. Then you will see that the falsepart will not be calculated Regards Claus B. -- Win XP PRof SP2 / Vista Ultimate SP2 Office 2003 SP2 /2007 Ultimate SP2 |
Avoiding redundant calculations
"Walter Briscoe" wrote:
One thought is the IF function, but that does not do lazy evaluation. e.g. =IF(True,Truepart,Falsepart) calculates both Truepart and Falsepart. That would be correrct if IF() behaved like a real function. For example, the problem with AND(ISNUMBER(B1),B1*B2B3) is that AND() does indeed calculate all of its parameter, even if the a left-hand parameter is FALSE (e.g. B1 is text), making it unnecessary to calculate parameters to the right. However, IF() behaves like an operator whose syntax looks like a function. In this case, it calculates its parameters left-to-right only as needed. We rely on this "all the time". For example, IF(ISNUMBER(B1),IF(B1*B2B3,1,2),3) avoids the #VALUE we get with AND() when B1 is not numeric. Some other worksheet "functions" work the same way, notably CHOOSE(). CHOOSE() calculates only its first parameter and the parameter selected by it. |
Avoiding redundant calculations
In message of Sun, 18 Aug 2013 08:39:21 in
microsoft.public.excel.worksheet.functions, joeu2004 writes "Walter Briscoe" wrote: One thought is the IF function, but that does not do lazy evaluation. e.g. =IF(True,Truepart,Falsepart) calculates both Truepart and Falsepart. That would be correrct if IF() behaved like a real function. For example, the problem with AND(ISNUMBER(B1),B1*B2B3) is that AND() does indeed calculate all of its parameter, even if the a left-hand parameter is FALSE (e.g. B1 is text), making it unnecessary to calculate parameters to the right. However, IF() behaves like an operator whose syntax looks like a function. In this case, it calculates its parameters left-to-right only as needed. We rely on this "all the time". For example, IF(ISNUMBER(B 1),IF(B1*B2B3,1,2),3) avoids the #VALUE we get with AND() when B1 is not numeric. Some other worksheet "functions" work the same way, notably CHOOSE(). CHOOSE() calculates only its first parameter and the parameter selected by it. Thanks to both joeu2004 and Claus for your prompt replies. I was wrong and you are both right about the operation of IF. I was thinking of the VBA function IIF. I did step through the calculations of IF and IF(logical_test,value_if_true,value_if_false) value_if_true is only evaluated if logical_test is TRUE. This is different from the VBA function IIf(expr, truepart, falsepart). I changed my expression to refer to fixed columns. i.e. to <=INT(--($A2<$A1)*VLOOKUP($F2,Stations,38,TRUE)+--($A2<$A3)*VLOOKUP($G2,Stations,38,TRUE)) I timed the copy through at 13 minutes and 24 seconds. I put Claus's idea in the next column and copied through. <=INT(IF($A2<$A1,VLOOKUP($F2,Stations,38,TRUE))+I F($A2<$A3,VLOOKUP($G2,Stations,38,TRUE))) I was surprised the 2 formulae copied through in the same time. ;) I am surprised. I tried =countif(N2:N2899,0), but Excel seemed not to like countif on furmulae resulting in 0. I copied the data and pasted the values and found 1942 of the values were 0. i.e. I would expect a saving of at least 70%. Plainly I don't understand what is going on. ;( -- Walter Briscoe |
Avoiding redundant calculations
"Walter Briscoe" wrote:
[editing out the extraneous "<" and "" characters that are misleading and confusing.] I changed my expression to refer to fixed columns. i.e. to =INT(--($A2<$A1)*VLOOKUP($F2,Stations,38,TRUE) +--($A2<$A3)*VLOOKUP($G2,Stations,38,TRUE)) I timed the copy through at 13 minutes and 24 seconds. I put Claus's idea in the next column and copied through. =INT(IF($A2<$A1,VLOOKUP($F2,Stations,38,TRUE)) +IF($A2<$A3,VLOOKUP($G2,Stations,38,TRUE))) I was surprised the 2 formulae copied through in the same time. [....] Plainly I don't understand what is going on. I agree. FYI, on my old (read: slow) single-CPU laptop, 3000 cells with Claus's formula calculates in under 0.04 seconds, even when the lookup column comprises of 1,048,576 rows, the max in Excel 2007 and later, and all conditional expressions are TRUE (i.e. 6000 VLOOKUPs). Of course, that's with constants in all cells; and those are the only values and formulas in the (new) workbook. PS: If you are using Excel 2003 and you have a COUNTIF formula that references the column into which you are copying the VLOOKUP formulas, the COUNTIF might be causing the large recalculation time. IIRC, I found that to be the case in Excel 2003 per se, at least sometimes. (I had difficulty duplicating the behavior.) The problem seemed to be fixed in Excel 2007 and later. "Walter Briscoe" wrote: I tried =countif(N2:N2899,0), but Excel seemed not to like countif on furmulae resulting in 0. I don't know "does not like" means. Anyway, in general, COUNTIF has no problem with counting formulas that result in zero. |
Avoiding redundant calculations
In message of Mon, 19 Aug 2013 08:14:51 in
microsoft.public.excel.worksheet.functions, joeu2004 writes "Walter Briscoe" wrote: [editing out the extraneous "<" and "" characters that are misleading and confusing.] I changed my expression to refer to fixed columns. i.e. to =INT(--($A2<$A1)*VLOOKUP($F2,Stations,38,TRUE) +--($A2<$A3)*VLOOKUP($G2,Stations,38,TRUE)) I timed the copy through at 13 minutes and 24 seconds. I put Claus's idea in the next column and copied through. =INT(IF($A2<$A1,VLOOKUP($F2,Stations,38,TRUE)) +IF($A2<$A3,VLOOKUP($G2,Stations,38,TRUE))) I was surprised the 2 formulae copied through in the same time. [....] Plainly I don't understand what is going on. I agree. FYI, on my old (read: slow) single-CPU laptop, 3000 cells with Claus's formula calculates in under 0.04 seconds, even when the lookup column comprises of 1,048,576 rows, the max in Excel 2007 and later, and all conditional expressions are TRUE (i.e. 6000 VLOOKUPs). Of course, that's with constants in all cells; and those are the only values and formulas in the (new) workbook. stations is a reference to a worksheet in a closed workbook. Stations refers to =[stations.xls]Sheet1!$A$1:$CB$427. I can't remember how [stations.xls] references C:\Users\IBM\AppData\Roaming\Microsoft\Excel\stati ons.xls If I open that workbook, my 13:25 becomes ~0:10 How do you measure 0.04 seconds? I am using my wrist watch. PS: If you are using Excel 2003 and you have a COUNTIF formula that references the column into which you are copying the VLOOKUP formulas, the COUNTIF might be causing the large recalculation time. IIRC, I found that to be the case in Excel 2003 per se, at least sometimes. (I had difficulty duplicating the behavior.) The problem seemed to be fixed in Excel 2007 and later. The COUNTIF was done to count the number of cells in which no VLOOKUP call is actually made. "Walter Briscoe" wrote: I tried =countif(N2:N2899,0), but Excel seemed not to like countif on furmulae resulting in 0. I don't know "does not like" means. Anyway, in general, COUNTIF has no problem with counting formulas that result in zero. -- Walter Briscoe |
Avoiding redundant calculations
"Walter Briscoe" wrote:
joeu2004 writes FYI, on my old (read: slow) single-CPU laptop, 3000 cells with Claus's formula calculates in under 0.04 seconds, even when the lookup column comprises of 1,048,576 rows, the max in Excel 2007 and later, and all conditional expressions are TRUE (i.e. 6000 VLOOKUPs). Clarifications.... That is 0.040 seconds (really 0.037 sec) to recalculate 3000 formulas already entered into cells. It takes about 0.090 sec to copy-and-paste. I'm not sure if you are measuring copy-and-paste time or simply recalculation time. I will use recalculation times below Both numbers are with ScreenUpdating=True (normal mode). The times are 0.26 sec and 0.70 sec respectively with ScreenUpdating=False. That is useful only if you want to put the operation into a VBA macro. It might make only a small (imperceptible?) difference in your case. I will leave ScreenUpdating=True for my times below. Also, I should note that all times are for a worst-case (longest) lookup. But I'm not sure that makes a significant difference since with VLOOKUP(...,TRUE), there are at most 20 (or 21?) comparisons when the lookup column comprises 1,048,576 rows. In any case, since you are using Excel 2003 and you indicate that the lookup data comprises only 450 rows, I will use that scenario, since the file size might have significant time in external file references. Finally, I reiterate that YMMV depending on the complexity of the workbook besides the formulas and data you tell us about, as well as on your computer's characteristics. "Walter Briscoe" wrote: stations is a reference to a worksheet in a closed workbook. Stations refers to =[stations.xls]Sheet1!$A$1:$CB$427. I can't remember how [stations.xls] references C:\Users\IBM\AppData\Roaming\Microsoft\Excel\stati ons.xls That is quite a significant "little detail" to omit! That makes a __huge__ difference in the VLOOKUP time, especially if stations.xls is closed. (Not recommended.) If stations.xls is closed in the VLOOKUP Excel instance, 3000 statements (6000 VLOOKUPs) takes 31 to 33 sec. When stations.xls is open in the VLOOKUP Excel instance, it takes about 0.033 sec. So the first major performance improvement would be to ensure that stations.xls is open in the same Excel instance. Note: My "data.xls" file contains only A1:CB450 data (contants). If your stations.xls has a lot of other data, even if they are constants, that might significantly affect the file size, which might increase the close-file VLOOKUP reference time significantly. That might explain why my "long" time is about 33 sec, whereas yours is about 13 min. "Walter Briscoe" wrote: If I open that workbook, my 13:25 becomes ~0:10 Ding! Much ado about nothing. You had already come to the right conclusion yourself. FYI, that explains why the reference is =[stations.xls]Sheet1!$A$1:$CB$427 instead of =[C:\Users\IBM\AppData\Roaming\Microsoft\Excel\stati ons.xls]Sheet1!$A$1:$CB$427. Excel automagically adjusts the reference when you open the file in the same Excel instance. "Walter Briscoe" wrote: How do you measure 0.04 seconds? I am using my wrist watch. For long-ish times, i.e. a lot more than 15.625 milliseconds, you could do something like the following: Sub doit() Dim t As Double Dim st As Single, et As Single ' type of Timer st = Timer Range("b1:b3000").Dirty ' range with VLOOKUP formulas et = Timer t = et - st ' seconds MsgBox Format(t, "0.000000") End Sub Caveats.... The elapsed time might be off by as much as almost 15.625 msec; 0.015625 sec. Performance measurement and interpretation is part art as well as science. For example, the time for Range("b1").Dirty is misleading because it includes significant overhead due to "communciation" between Excel and VBA. The time for Range("b1:b3000").Dirty overwhelms that overhead, so the overhead becomes insignificant (we hope). However, it opens the door for a lot of extraneous computer activity (other events). On the other hand, arguably that is realistic. It depends on what exactly we want to measure. Another issue is whether or not to disable ScreenUpdating and Automatic recalculation. Again, it depends on what exactly we want to measure. In this case, I purposely keep those features enabled in order to mimic the "real world" situation. But if we did disable Automatic recalculation, we must use Range.Calculate instead of Range.Dirty. Either way can result in significant variability of time measurements. Moreover, I have found that there is signficant VBA overhead the first time a code path is executed after any change. And some overhead arises each time a macro is executed. So I routinely ignore the first execution of a code path. For "serious" measurements, I collect a "sufficient" sampling for statistical analysis. Finally, usually I use a timer with greater resolution. Besides avoiding the off-by-16-msec anomaly, it is useful for measuring small intervals (if we can control overhead). Here are the macros that I use. Option Explicit Public Declare Sub Sleep Lib "kernel32" (ByVal msec As Long) Public Declare Function QueryPerformanceFrequency Lib "kernel32" _ (ByRef freq As Currency) As Long Public Declare Function QueryPerformanceCounter Lib "kernel32" _ (ByRef cnt As Currency) As Long Private freq As Currency, df As Double Function myTimer() As Currency QueryPerformanceCounter myTimer End Function Function myElapsedTime(ByVal dt As Currency) As Double If freq = 0 Then QueryPerformanceFrequency freq: df = freq myElapsedTime = dt / df End Function Usage: Dim t as Double Dim st As Currency, et As Currency st = myTimer ' ... activity to measure ... et = myTimer t = myElapsedTime(et - st) ' seconds |
All times are GMT +1. The time now is 09:55 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com