Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.misc
|
|||
|
|||
Lessons from optimizing a slow spreadsheet
I've been working on a report that contains YTD sales and cost info. Its
about 4 megs, took a little over 3 minutes to fully recalculate (I know, many around here have larger/slower ones, but thats not the point). The primary data tab has about 4k rows and 50 or so columns, there are 2 reports (top 10 customers and "others) which are about 400 rows and 3000 rows respectively and 37 columns, which to begin with were had about 8 sum products and 7 or so sumifs, one vlookup and the rest were sum or multiplication. There are 2 date fields up top to enter beg and ending to pull specific month's or quarters out of the data. At first each sumifs or sumproduct had 4 conditions, it matched customer, lot #, date after beg date and date before end date. My first step was to add 2 calculation columns called include to the data tab (1 for top 10 report 1 for other report) that was as follows if(and(transaction date=beg date,transaction date<=end date),1,0) and reduced the sumifs and sumproducts to 3 conditions, customer, lot #, include=1. That alone reduced my calc time by about 50% to just over 1.5 min. Next I added calculation columns to the data tab to eliminate the need for sumproducts and replaced all sumproducts with sumifs, this reduced my calc time to under 1 min. Finally I had about 15 dynamic named ranges, I calculated how many rows down my data went just once in a cell using counta(a:a) then referred to that cell in all my ranges. I had used this to limit the # of rows sumproduct had to deal with, but since I had replaced all sumproducts with sumifs, and I had read sumifs can intelligently handle whole column references and not waste time on empty cells, I changed my sumifs to use entire columns and deleted all named ranges, this saved me another 30 seconds, and got me to 28 sec for a full recalc, which seems much more reasonable to me. It seems with a lot of sumifs or sumproducts each extra condition increases recalc time nearly exponentially. My final idea which I have not tried yet and dont really know if it will help would be to sort my data by date, calculate the rows included in each month just once and have my sumifs use that range only, eliminating the need the third condition, so then they would only have to match customer and lot #. Again this seems the most complicated and I dont know if I would gain or lose from it. |
#2
Posted to microsoft.public.excel.misc
|
|||
|
|||
Lessons from optimizing a slow spreadsheet
Check out this link on optomizing speed, calculations and memory...
http://www.decisionmodels.com/index.htm Dynamic named ranges using offset are volatile meaning that the sumproducts using them are always dirty and require calculation. Generally I use dynamic named ranges for vlidation lists or pivot table ranges. If you use them with calculations the calculations had better be very light weight or you will see performance drop. Based on your descripiton have you considered using pivot tables. That would take your calculation overhead down to just about zero. -- HTH... Jim Thomlinson "Robbro" wrote: I've been working on a report that contains YTD sales and cost info. Its about 4 megs, took a little over 3 minutes to fully recalculate (I know, many around here have larger/slower ones, but thats not the point). The primary data tab has about 4k rows and 50 or so columns, there are 2 reports (top 10 customers and "others) which are about 400 rows and 3000 rows respectively and 37 columns, which to begin with were had about 8 sum products and 7 or so sumifs, one vlookup and the rest were sum or multiplication. There are 2 date fields up top to enter beg and ending to pull specific month's or quarters out of the data. At first each sumifs or sumproduct had 4 conditions, it matched customer, lot #, date after beg date and date before end date. My first step was to add 2 calculation columns called include to the data tab (1 for top 10 report 1 for other report) that was as follows if(and(transaction date=beg date,transaction date<=end date),1,0) and reduced the sumifs and sumproducts to 3 conditions, customer, lot #, include=1. That alone reduced my calc time by about 50% to just over 1.5 min. Next I added calculation columns to the data tab to eliminate the need for sumproducts and replaced all sumproducts with sumifs, this reduced my calc time to under 1 min. Finally I had about 15 dynamic named ranges, I calculated how many rows down my data went just once in a cell using counta(a:a) then referred to that cell in all my ranges. I had used this to limit the # of rows sumproduct had to deal with, but since I had replaced all sumproducts with sumifs, and I had read sumifs can intelligently handle whole column references and not waste time on empty cells, I changed my sumifs to use entire columns and deleted all named ranges, this saved me another 30 seconds, and got me to 28 sec for a full recalc, which seems much more reasonable to me. It seems with a lot of sumifs or sumproducts each extra condition increases recalc time nearly exponentially. My final idea which I have not tried yet and dont really know if it will help would be to sort my data by date, calculate the rows included in each month just once and have my sumifs use that range only, eliminating the need the third condition, so then they would only have to match customer and lot #. Again this seems the most complicated and I dont know if I would gain or lose from it. |
#3
Posted to microsoft.public.excel.misc
|
|||
|
|||
Lessons from optimizing a slow spreadsheet
Yes, I have considered, but gave up fairly quickly, we have a number of
issues that I'm not sure can be easily incorporated into pivot tables. We have multiple plants so we have customer by product with summary by plant then customer summary with returns and allowances figured into some numbers, some rebates included at times, manufacturing and material variances and just a whole host of issues in the summary area that at least with my limited knowledge of pivot tables would exclude them from use. I've actually started the project over in access with some nice results, its a bit of a pain to set up but once its done it seems to be what we are looking for. The final layout and information though are still constantly being changed, and once all that is finalized it will probably be an access report instead of a spreadsheet if those above will allow. "Jim Thomlinson" wrote: Check out this link on optomizing speed, calculations and memory... http://www.decisionmodels.com/index.htm Dynamic named ranges using offset are volatile meaning that the sumproducts using them are always dirty and require calculation. Generally I use dynamic named ranges for vlidation lists or pivot table ranges. If you use them with calculations the calculations had better be very light weight or you will see performance drop. Based on your descripiton have you considered using pivot tables. That would take your calculation overhead down to just about zero. -- HTH... Jim Thomlinson "Robbro" wrote: I've been working on a report that contains YTD sales and cost info. Its about 4 megs, took a little over 3 minutes to fully recalculate (I know, many around here have larger/slower ones, but thats not the point). The primary data tab has about 4k rows and 50 or so columns, there are 2 reports (top 10 customers and "others) which are about 400 rows and 3000 rows respectively and 37 columns, which to begin with were had about 8 sum products and 7 or so sumifs, one vlookup and the rest were sum or multiplication. There are 2 date fields up top to enter beg and ending to pull specific month's or quarters out of the data. At first each sumifs or sumproduct had 4 conditions, it matched customer, lot #, date after beg date and date before end date. My first step was to add 2 calculation columns called include to the data tab (1 for top 10 report 1 for other report) that was as follows if(and(transaction date=beg date,transaction date<=end date),1,0) and reduced the sumifs and sumproducts to 3 conditions, customer, lot #, include=1. That alone reduced my calc time by about 50% to just over 1.5 min. Next I added calculation columns to the data tab to eliminate the need for sumproducts and replaced all sumproducts with sumifs, this reduced my calc time to under 1 min. Finally I had about 15 dynamic named ranges, I calculated how many rows down my data went just once in a cell using counta(a:a) then referred to that cell in all my ranges. I had used this to limit the # of rows sumproduct had to deal with, but since I had replaced all sumproducts with sumifs, and I had read sumifs can intelligently handle whole column references and not waste time on empty cells, I changed my sumifs to use entire columns and deleted all named ranges, this saved me another 30 seconds, and got me to 28 sec for a full recalc, which seems much more reasonable to me. It seems with a lot of sumifs or sumproducts each extra condition increases recalc time nearly exponentially. My final idea which I have not tried yet and dont really know if it will help would be to sort my data by date, calculate the rows included in each month just once and have my sumifs use that range only, eliminating the need the third condition, so then they would only have to match customer and lot #. Again this seems the most complicated and I dont know if I would gain or lose from it. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Help Optimizing a Sheet to improve performance | Excel Worksheet Functions | |||
Microsoft lessons should be listed from basic to advanced | New Users to Excel | |||
Optimizing mortage payback | Excel Worksheet Functions | |||
MapPoint Optimizing | Excel Discussion (Misc queries) | |||
Excel Function lessons Needed! | Excel Worksheet Functions |