ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Getting around Dependency Tree Performance Problems (https://www.excelbanter.com/excel-programming/422158-getting-around-dependency-tree-performance-problems.html)

AllSensibleNamesTaken

Getting around Dependency Tree Performance Problems
 
Hello all,
I'm looking for tips on how to avoid the seemingly intermitent, yet cripling
performance bottleneck that occurs when calling User Defined Functions that
refer to large ranges containing formulas.

Let me describe a simple setup that triggers the performance problem I am
seeing.

PROBLEM SETUP
----------------
On a new blank spreadsheet, setup a column containing 14,000 static (ie. not
formulas) numbers. Let's call this the Static column.

Setup another 14,000 row column next to the Static column. Let's call this
the Formula column. Each cell in the Formula column should manipulate the
equivalent cell in the Static column using a formula (subtracting a number
from the number in the static column should suffice).

enter a call to the UDF "doit" on another cell, passing it the entire
Formula column as the input argument range. An implementation of doit is
found below

Public Function doit(theRange As Range) As Long
'Doing something with theRange seems to trigger the performance
'bottleneck
Dim dummy as Integer
If IsEmpty(theRange) Then dummy = 0

'Count number of times this function is called
static numberOfCalls as Long
numberOfCalls = numberOfCals+1

doit=numberOfCalls
End Function
--------------

PROBLEM OBSERVATIONS
--------------------------
1)
The performance hit is due to the doit function intermitently being called
14,000 times by excel (once for each cell in the input argument range)
This happens most often though not always when requesting a Full recalc with
dependency tree rebuild (Ctrl+Alt+Shift+F9) though it also happens, albeit
less frequently with all other calculate requests, namely:
Full Recalc (Ctrl+Alt+F9)
Recalc (F9)
Sheet Recalc (Shift+F9)

2)
Note that this DOES NOT seem to happen if instead of the Formula Column, we
pass the Static column to the doit function

3)
Note also that this does not seem to happen if the column size is small

4)
In a more complex spreadsheet with several "doit" calls referring to various
large formula ranges, this intermitent 14,000 times call repeat problem
occurs much more often as it seems to affect each doit cell at different
times. The more doIt cells there are the more chances for the problem to
occur. INterestingly though I have not yet seen an ocasion when it's affected
more than one doit cell at a time.


PROBLEM ANALYSIS
----------------------
I am experiencing this using Excel 2003 and my thoughts on what is happening
based on the observations above are the following:

1)
The fact that the call is repeated once for each cell in the passed range
(14,000 times) and that the problem occurs most often when requesting a Full
Recalc with dependency tree rebuild leads me to beleive that the problem is
caused by the dependency tree logic

2)
The fact that the problem only manifest itself when the passed range is made
up of formulas which themselves depend on other cells, also points to a
dependency tree problem as clearly this setup has a higher level of
dependencies than when the passed range is just the static data column

3)
The fact that the problem does not manifest itself for small ranges points
to a possible Excel 2003 dependency limit breach causing Excel to always do a
full recalc rather than store a large dependency tree. However, why is the
behaviour intermitent? And why is the limit breached for such a small
spreadsheet. (Read more on limits on these links)
http://blogs.msdn.com/excel/archive/...26/474258.aspx
http://www.decisionmodels.com/calcsecretsf.htm

5)
I am also surprised that the problem does not occur unless the UDF mentions
the passed range in the body of the code. ie the range being in the argment
list on its own does not cause the problem. It's only when you try t use the
argument in the body


WHAT NEXT?
--------------

I would be very greatful if anyone with experience in building large
spreadsheets or with experience of this particular problem could share how
they've been getting around this issue

Thanks so much


Charles Williams

Getting around Dependency Tree Performance Problems
 
There are several things here that can cause poor performance.

1. Bug in Excel VBA that causes the VBE title bar to be refreshed each time
a VBA UDF is executed, unless the calculation is initated from VBA: bypass
is to run in manual calc mode and trap all calculation key sequences (F9,
Shift/F9, Ctrl/alt/F9 etc) to use a VBA Application.Calculate,
Activesheet.calculate, application.calculateFull etc command.
This will dramatically speed up each execution of the UDF.

2. VBA UDFs are called by Excel even when their arguments are uncalculated.
This can cause the UDF to be repeatedly called as you have discovered.
Uncalculated cells appear to VBA as Empty cells so for single cells you can
use an IsEmpty() test. For a Range it is not very efficient to loop through
every cell testing for IsEmpty so its faster to have code like this

Public Function tester1(theRange As Range)
On Error GoTo Fail
If Application.WorksheetFunction.CountA(theRange) < theRange.Rows.Count
Then Exit Function
lCount = lCount + 1
tester1 = lCount
Exit Function
Fail:
End Function

Alternatively you can change the formula to something like
=Counta(B1:B14000)*0+Tester1(b1:b14000)

3. This problem only occurs when the the calculation sequence has not been
optimised by previously being calculated. Thus for the second of two
successive calculations the UDF will usually only be called once.
Note modifying a formula flags it as uncalculated and puts it at in first
position in the calculation chain (LIFO).

Using these techniques will reduce the worst-case timing for the 14000 cell
range on my system from 65 seconds to 2 seconds. The second full calculation
will take milliseconds.

See http://www.DecisionModels.calcsecretsj.htm and for a discussion of UDFs
and http://www.decisionmodels.com/calcsecretsc.htm and associated pages for
a discussion of the calculation sequnce and dependency trees.


Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"AllSensibleNamesTaken"
wrote in message ...
Hello all,
I'm looking for tips on how to avoid the seemingly intermitent, yet
cripling
performance bottleneck that occurs when calling User Defined Functions
that
refer to large ranges containing formulas.

Let me describe a simple setup that triggers the performance problem I am
seeing.

PROBLEM SETUP
----------------
On a new blank spreadsheet, setup a column containing 14,000 static (ie.
not
formulas) numbers. Let's call this the Static column.

Setup another 14,000 row column next to the Static column. Let's call this
the Formula column. Each cell in the Formula column should manipulate the
equivalent cell in the Static column using a formula (subtracting a number
from the number in the static column should suffice).

enter a call to the UDF "doit" on another cell, passing it the entire
Formula column as the input argument range. An implementation of doit is
found below

Public Function doit(theRange As Range) As Long
'Doing something with theRange seems to trigger the performance
'bottleneck
Dim dummy as Integer
If IsEmpty(theRange) Then dummy = 0

'Count number of times this function is called
static numberOfCalls as Long
numberOfCalls = numberOfCals+1

doit=numberOfCalls
End Function
--------------

PROBLEM OBSERVATIONS
--------------------------
1)
The performance hit is due to the doit function intermitently being called
14,000 times by excel (once for each cell in the input argument range)
This happens most often though not always when requesting a Full recalc
with
dependency tree rebuild (Ctrl+Alt+Shift+F9) though it also happens, albeit
less frequently with all other calculate requests, namely:
Full Recalc (Ctrl+Alt+F9)
Recalc (F9)
Sheet Recalc (Shift+F9)

2)
Note that this DOES NOT seem to happen if instead of the Formula Column,
we
pass the Static column to the doit function

3)
Note also that this does not seem to happen if the column size is small

4)
In a more complex spreadsheet with several "doit" calls referring to
various
large formula ranges, this intermitent 14,000 times call repeat problem
occurs much more often as it seems to affect each doit cell at different
times. The more doIt cells there are the more chances for the problem to
occur. INterestingly though I have not yet seen an ocasion when it's
affected
more than one doit cell at a time.


PROBLEM ANALYSIS
----------------------
I am experiencing this using Excel 2003 and my thoughts on what is
happening
based on the observations above are the following:

1)
The fact that the call is repeated once for each cell in the passed range
(14,000 times) and that the problem occurs most often when requesting a
Full
Recalc with dependency tree rebuild leads me to beleive that the problem
is
caused by the dependency tree logic

2)
The fact that the problem only manifest itself when the passed range is
made
up of formulas which themselves depend on other cells, also points to a
dependency tree problem as clearly this setup has a higher level of
dependencies than when the passed range is just the static data column

3)
The fact that the problem does not manifest itself for small ranges points
to a possible Excel 2003 dependency limit breach causing Excel to always
do a
full recalc rather than store a large dependency tree. However, why is the
behaviour intermitent? And why is the limit breached for such a small
spreadsheet. (Read more on limits on these links)
http://blogs.msdn.com/excel/archive/...26/474258.aspx
http://www.decisionmodels.com/calcsecretsf.htm

5)
I am also surprised that the problem does not occur unless the UDF
mentions
the passed range in the body of the code. ie the range being in the
argment
list on its own does not cause the problem. It's only when you try t use
the
argument in the body


WHAT NEXT?
--------------

I would be very greatful if anyone with experience in building large
spreadsheets or with experience of this particular problem could share how
they've been getting around this issue

Thanks so much





Charles Williams

Getting around Dependency Tree Performance Problems
 
Actually I have just discovered that using theRange.Rows.Count seems to work
much faster:

Option Explicit
Dim lCount As Long
Public Function tester1(theRange As Range)
Dim n As Long
On Error GoTo Fail
n = theRange.Rows.Count

lCount = lCount + 1
tester1 = lCount
Exit Function
Fail:
End Function


Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"Charles Williams" wrote in message
...
There are several things here that can cause poor performance.

1. Bug in Excel VBA that causes the VBE title bar to be refreshed each
time a VBA UDF is executed, unless the calculation is initated from VBA:
bypass is to run in manual calc mode and trap all calculation key
sequences (F9, Shift/F9, Ctrl/alt/F9 etc) to use a VBA
Application.Calculate, Activesheet.calculate, application.calculateFull
etc command.
This will dramatically speed up each execution of the UDF.

2. VBA UDFs are called by Excel even when their arguments are
uncalculated. This can cause the UDF to be repeatedly called as you have
discovered. Uncalculated cells appear to VBA as Empty cells so for single
cells you can use an IsEmpty() test. For a Range it is not very efficient
to loop through every cell testing for IsEmpty so its faster to have code
like this

Public Function tester1(theRange As Range)
On Error GoTo Fail
If Application.WorksheetFunction.CountA(theRange) < theRange.Rows.Count
Then Exit Function
lCount = lCount + 1
tester1 = lCount
Exit Function
Fail:
End Function

Alternatively you can change the formula to something like
=Counta(B1:B14000)*0+Tester1(b1:b14000)

3. This problem only occurs when the the calculation sequence has not been
optimised by previously being calculated. Thus for the second of two
successive calculations the UDF will usually only be called once.
Note modifying a formula flags it as uncalculated and puts it at in first
position in the calculation chain (LIFO).

Using these techniques will reduce the worst-case timing for the 14000
cell range on my system from 65 seconds to 2 seconds. The second full
calculation will take milliseconds.

See http://www.DecisionModels.calcsecretsj.htm and for a discussion of
UDFs
and http://www.decisionmodels.com/calcsecretsc.htm and associated pages
for a discussion of the calculation sequnce and dependency trees.


Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"AllSensibleNamesTaken"
wrote in message
...
Hello all,
I'm looking for tips on how to avoid the seemingly intermitent, yet
cripling
performance bottleneck that occurs when calling User Defined Functions
that
refer to large ranges containing formulas.

Let me describe a simple setup that triggers the performance problem I am
seeing.

PROBLEM SETUP
----------------
On a new blank spreadsheet, setup a column containing 14,000 static (ie.
not
formulas) numbers. Let's call this the Static column.

Setup another 14,000 row column next to the Static column. Let's call
this
the Formula column. Each cell in the Formula column should manipulate the
equivalent cell in the Static column using a formula (subtracting a
number
from the number in the static column should suffice).

enter a call to the UDF "doit" on another cell, passing it the entire
Formula column as the input argument range. An implementation of doit is
found below

Public Function doit(theRange As Range) As Long
'Doing something with theRange seems to trigger the performance
'bottleneck
Dim dummy as Integer
If IsEmpty(theRange) Then dummy = 0

'Count number of times this function is called
static numberOfCalls as Long
numberOfCalls = numberOfCals+1

doit=numberOfCalls
End Function
--------------

PROBLEM OBSERVATIONS
--------------------------
1)
The performance hit is due to the doit function intermitently being
called
14,000 times by excel (once for each cell in the input argument range)
This happens most often though not always when requesting a Full recalc
with
dependency tree rebuild (Ctrl+Alt+Shift+F9) though it also happens,
albeit
less frequently with all other calculate requests, namely:
Full Recalc (Ctrl+Alt+F9)
Recalc (F9)
Sheet Recalc (Shift+F9)

2)
Note that this DOES NOT seem to happen if instead of the Formula Column,
we
pass the Static column to the doit function

3)
Note also that this does not seem to happen if the column size is small

4)
In a more complex spreadsheet with several "doit" calls referring to
various
large formula ranges, this intermitent 14,000 times call repeat problem
occurs much more often as it seems to affect each doit cell at different
times. The more doIt cells there are the more chances for the problem to
occur. INterestingly though I have not yet seen an ocasion when it's
affected
more than one doit cell at a time.


PROBLEM ANALYSIS
----------------------
I am experiencing this using Excel 2003 and my thoughts on what is
happening
based on the observations above are the following:

1)
The fact that the call is repeated once for each cell in the passed range
(14,000 times) and that the problem occurs most often when requesting a
Full
Recalc with dependency tree rebuild leads me to beleive that the problem
is
caused by the dependency tree logic

2)
The fact that the problem only manifest itself when the passed range is
made
up of formulas which themselves depend on other cells, also points to a
dependency tree problem as clearly this setup has a higher level of
dependencies than when the passed range is just the static data column

3)
The fact that the problem does not manifest itself for small ranges
points
to a possible Excel 2003 dependency limit breach causing Excel to always
do a
full recalc rather than store a large dependency tree. However, why is
the
behaviour intermitent? And why is the limit breached for such a small
spreadsheet. (Read more on limits on these links)
http://blogs.msdn.com/excel/archive/...26/474258.aspx
http://www.decisionmodels.com/calcsecretsf.htm

5)
I am also surprised that the problem does not occur unless the UDF
mentions
the passed range in the body of the code. ie the range being in the
argment
list on its own does not cause the problem. It's only when you try t use
the
argument in the body


WHAT NEXT?
--------------

I would be very greatful if anyone with experience in building large
spreadsheets or with experience of this particular problem could share
how
they've been getting around this issue

Thanks so much








Charles Williams

Getting around Dependency Tree Performance Problems
 
Ooops ...
That does not work there is no proper dependency on the Range...

Please ignore.
Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"Charles Williams" wrote in message
...
Actually I have just discovered that using theRange.Rows.Count seems to
work much faster:

Option Explicit
Dim lCount As Long
Public Function tester1(theRange As Range)
Dim n As Long
On Error GoTo Fail
n = theRange.Rows.Count

lCount = lCount + 1
tester1 = lCount
Exit Function
Fail:
End Function


Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"Charles Williams" wrote in message
...
There are several things here that can cause poor performance.

1. Bug in Excel VBA that causes the VBE title bar to be refreshed each
time a VBA UDF is executed, unless the calculation is initated from VBA:
bypass is to run in manual calc mode and trap all calculation key
sequences (F9, Shift/F9, Ctrl/alt/F9 etc) to use a VBA
Application.Calculate, Activesheet.calculate, application.calculateFull
etc command.
This will dramatically speed up each execution of the UDF.

2. VBA UDFs are called by Excel even when their arguments are
uncalculated. This can cause the UDF to be repeatedly called as you have
discovered. Uncalculated cells appear to VBA as Empty cells so for single
cells you can use an IsEmpty() test. For a Range it is not very efficient
to loop through every cell testing for IsEmpty so its faster to have code
like this

Public Function tester1(theRange As Range)
On Error GoTo Fail
If Application.WorksheetFunction.CountA(theRange) < theRange.Rows.Count
Then Exit Function
lCount = lCount + 1
tester1 = lCount
Exit Function
Fail:
End Function

Alternatively you can change the formula to something like
=Counta(B1:B14000)*0+Tester1(b1:b14000)

3. This problem only occurs when the the calculation sequence has not
been optimised by previously being calculated. Thus for the second of two
successive calculations the UDF will usually only be called once.
Note modifying a formula flags it as uncalculated and puts it at in first
position in the calculation chain (LIFO).

Using these techniques will reduce the worst-case timing for the 14000
cell range on my system from 65 seconds to 2 seconds. The second full
calculation will take milliseconds.

See http://www.DecisionModels.calcsecretsj.htm and for a discussion of
UDFs
and http://www.decisionmodels.com/calcsecretsc.htm and associated pages
for a discussion of the calculation sequnce and dependency trees.


Charles
___________________________________
London Excel Users Conference April 1-2
The Excel Calculation Site
http://www.decisionmodels.com

"AllSensibleNamesTaken"
wrote in message
...
Hello all,
I'm looking for tips on how to avoid the seemingly intermitent, yet
cripling
performance bottleneck that occurs when calling User Defined Functions
that
refer to large ranges containing formulas.

Let me describe a simple setup that triggers the performance problem I
am
seeing.

PROBLEM SETUP
----------------
On a new blank spreadsheet, setup a column containing 14,000 static (ie.
not
formulas) numbers. Let's call this the Static column.

Setup another 14,000 row column next to the Static column. Let's call
this
the Formula column. Each cell in the Formula column should manipulate
the
equivalent cell in the Static column using a formula (subtracting a
number
from the number in the static column should suffice).

enter a call to the UDF "doit" on another cell, passing it the entire
Formula column as the input argument range. An implementation of doit is
found below

Public Function doit(theRange As Range) As Long
'Doing something with theRange seems to trigger the performance
'bottleneck
Dim dummy as Integer
If IsEmpty(theRange) Then dummy = 0

'Count number of times this function is called
static numberOfCalls as Long
numberOfCalls = numberOfCals+1

doit=numberOfCalls
End Function
--------------

PROBLEM OBSERVATIONS
--------------------------
1)
The performance hit is due to the doit function intermitently being
called
14,000 times by excel (once for each cell in the input argument range)
This happens most often though not always when requesting a Full recalc
with
dependency tree rebuild (Ctrl+Alt+Shift+F9) though it also happens,
albeit
less frequently with all other calculate requests, namely:
Full Recalc (Ctrl+Alt+F9)
Recalc (F9)
Sheet Recalc (Shift+F9)

2)
Note that this DOES NOT seem to happen if instead of the Formula Column,
we
pass the Static column to the doit function

3)
Note also that this does not seem to happen if the column size is small

4)
In a more complex spreadsheet with several "doit" calls referring to
various
large formula ranges, this intermitent 14,000 times call repeat problem
occurs much more often as it seems to affect each doit cell at different
times. The more doIt cells there are the more chances for the problem to
occur. INterestingly though I have not yet seen an ocasion when it's
affected
more than one doit cell at a time.


PROBLEM ANALYSIS
----------------------
I am experiencing this using Excel 2003 and my thoughts on what is
happening
based on the observations above are the following:

1)
The fact that the call is repeated once for each cell in the passed
range
(14,000 times) and that the problem occurs most often when requesting a
Full
Recalc with dependency tree rebuild leads me to beleive that the problem
is
caused by the dependency tree logic

2)
The fact that the problem only manifest itself when the passed range is
made
up of formulas which themselves depend on other cells, also points to a
dependency tree problem as clearly this setup has a higher level of
dependencies than when the passed range is just the static data column

3)
The fact that the problem does not manifest itself for small ranges
points
to a possible Excel 2003 dependency limit breach causing Excel to always
do a
full recalc rather than store a large dependency tree. However, why is
the
behaviour intermitent? And why is the limit breached for such a small
spreadsheet. (Read more on limits on these links)
http://blogs.msdn.com/excel/archive/...26/474258.aspx
http://www.decisionmodels.com/calcsecretsf.htm

5)
I am also surprised that the problem does not occur unless the UDF
mentions
the passed range in the body of the code. ie the range being in the
argment
list on its own does not cause the problem. It's only when you try t use
the
argument in the body


WHAT NEXT?
--------------

I would be very greatful if anyone with experience in building large
spreadsheets or with experience of this particular problem could share
how
they've been getting around this issue

Thanks so much











All times are GMT +1. The time now is 12:28 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com