Opening/Closing Large Files
On Jan 24, 7:49 am, "Nigel" wrote:
I have a VBA process that reads through a single folder (on a corporate network - high performance) of Excel workbooks, sequentially opens each one, extracts a small amount of data, then closes it without saving. Only one workbook is open at any one time. There are 1350 workbooks, all about 7 MB in size. When the process first starts, files are opened and closed reasonably quickly, at about 12 per minute; as the process progresses it gets slower and slower and eventually appears to stop (no error messages). It appears as if system resources are being consumed as each file is processed and are not released. Is this possible, or is something else causing the problem?

My code is as follows. The names of the files to process are already stored on sheet (shControl), column A.

Sub ProcessFiles()
    Dim dataWb As Workbook
    Dim xr As Long, nextData As Long

    Application.EnableEvents = False
    Application.ScreenUpdating = False

    ' start row for data
    nextData = 1

    ' process each workbook
    For xr = 1 To 1350
        ' open data workbook
        Set dataWb = Workbooks.Open("H:\myFiles\" & shControl.Cells(xr, 1))
        ' get data
        With dataWb.Sheets("Appraisal")
            shControl.Cells(nextData, 4) = .Cells(10, "I").Value
            shControl.Cells(nextData, 5) = .Cells(11, "I").Value
            shControl.Cells(nextData, 6) = .Cells(12, "I").Value
        End With
        ' advance o/p row
        nextData = nextData + 1
        ' close workbook
        dataWb.Close False
    Next xr

    Application.EnableEvents = True
    Application.ScreenUpdating = True
End Sub

--
Regards,
Nigel

Look at John Walkenbach's site for the article on reading from closed workbooks: http://j-walk.com/ss/excel/tips/tip82.htm

Also: shControl.Cells(nextData, 4) might be shControl.Cells(nextData, 4).Value

Maybe you should also Set dataWb = Nothing on every loop.
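[A minimal sketch of the read-from-closed-workbook technique dbKemp points to: Application.ExecuteExcel4Macro evaluates an external reference and returns the cell value without opening the file. The folder, sheet name, and cell positions are taken from Nigel's code above; the helper name GetClosedValue and the wrapper loop are hypothetical, not code from John Walkenbach's article.]

Function GetClosedValue(ByVal folder As String, ByVal fileName As String, _
                        ByVal sheetName As String, ByVal rowNum As Long, _
                        ByVal colNum As Long) As Variant
    Dim ref As String
    ' build an R1C1-style external reference, e.g. 'H:\myFiles\[Book1.xls]Appraisal'!R10C9
    ref = "'" & folder & "[" & fileName & "]" & sheetName & "'!R" & rowNum & "C" & colNum
    ' returns the stored cell value, or an error value if the file/sheet/cell is not found
    GetClosedValue = Application.ExecuteExcel4Macro(ref)
End Function

Sub ProcessFilesWithoutOpening()
    Dim xr As Long
    For xr = 1 To 1350
        ' column I is column 9; rows 10-12 as in the original code
        shControl.Cells(xr, 4).Value = GetClosedValue("H:\myFiles\", shControl.Cells(xr, 1).Value, "Appraisal", 10, 9)
        shControl.Cells(xr, 5).Value = GetClosedValue("H:\myFiles\", shControl.Cells(xr, 1).Value, "Appraisal", 11, 9)
        shControl.Cells(xr, 6).Value = GetClosedValue("H:\myFiles\", shControl.Cells(xr, 1).Value, "Appraisal", 12, 9)
    Next xr
End Sub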
Opening/Closing Large Files
I don't have a good suggestion.
Maybe a bad one. Keep track of which file you're processing and, after a few (before it slows down to a crawl), close (and save) the file, close Excel, reopen Excel and your file, and do the next batch. Maybe closing Excel wouldn't be necessary???

======

And I've never had to do this with 1350 workbooks, but since you're just retrieving 3 values, maybe it would be quicker to build formulas and plop them into the cells directly. If it slows down after a bunch, then do it in groups and convert the formulas to values for each bunch.

Just guesses, though. I'd test on a much smaller number of files, too.

Nigel wrote:
-snip-

--
Dave Peterson
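[A minimal sketch of the build-formulas idea, assuming the layout from Nigel's original post (file names in column A of shControl, results in columns D:F). The ='path[workbook]sheet'!cell style of external reference pulls values from closed workbooks; converting the range to values afterwards drops the 1350 live links. The macro name is hypothetical.]

Sub BuildLinkFormulas()
    Dim xr As Long, fName As String

    Application.Calculation = xlCalculationManual
    For xr = 1 To 1350
        fName = shControl.Cells(xr, 1).Value
        ' external references to the closed workbooks
        shControl.Cells(xr, 4).Formula = "='H:\myFiles\[" & fName & "]Appraisal'!$I$10"
        shControl.Cells(xr, 5).Formula = "='H:\myFiles\[" & fName & "]Appraisal'!$I$11"
        shControl.Cells(xr, 6).Formula = "='H:\myFiles\[" & fName & "]Appraisal'!$I$12"
    Next xr
    Application.Calculation = xlCalculationAutomatic   ' let the links resolve

    ' freeze the results so the links do not stay live
    With shControl.Range("D1:F1350")
        .Value = .Value
    End With
End Sub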
Opening/Closing Large Files
Thanks for the tip - the get-data-from-a-closed-file code on John Walkenbach's site looks interesting. I will try that out.
--
Regards,
Nigel

"dbKemp" wrote in message ...
-snip-
Opening/Closing Large Files
Hi Dave,
Well it might come to that. The files are tracked, and I am considering creating 10 sub-folders with fewer than 135 files each, as that number appears to work OK. So to get it done I will probably do that. I do want to discover what is causing this breakdown, though, so I shall run some more tests, as I am not sure why the system slows or indeed whether it is Excel or the OS.

Cheers
--
Regards,
Nigel

"Dave Peterson" wrote in message ...
-snip-
Opening/Closing Large Files
I'd really consider building the formulas--I'm not sure if you read that portion.

Nigel wrote:
-snip-

--
Dave Peterson
Opening/Closing Large Files
Nigel,
Since I am not on a network, I hesitate to reply. Your other responses cover most of the ideas I considered; however, you might also try using an object reference instead of the sheet code name "shControl". Also, is there any kind of security protection on the network that might shut down access after repeated access requests? That might be something your IT people could have set up.
--
Jim Cone
San Francisco, USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)

"Nigel" wrote in message
-snip-
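[A minimal sketch of Jim's object-reference point: hold an explicit Worksheet reference instead of relying on the sheet code name. The tab name "Control" and the variable name are assumptions for illustration.]

Sub ProcessFilesWithObjectRef()
    Dim wsControl As Worksheet
    ' explicit worksheet reference instead of the code name shControl
    Set wsControl = ThisWorkbook.Worksheets("Control")   ' tab name is an assumption

    ' ...same loop as before, but read and write through wsControl, e.g.
    ' Set dataWb = Workbooks.Open("H:\myFiles\" & wsControl.Cells(xr, 1).Value)

    Set wsControl = Nothing
End Sub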
Opening/Closing Large Files
Hi Jim
The object reference I will try. I have just checked the PC resources and they do not appear to be accumulating; as you might expect, the resources go down as a file is opened and return to similar levels after it is closed. I cannot see any leakage and I am stumped.

Good point on the network front; I need to check with the IT guys, but this slowdown is progressive: the more files get processed, the slower it gets. From 5 secs per file at the start, down to 15 secs per file by 100 files, 30 secs per file by 200, etc. Not sure the network would slow like this unless there is some form of bandwidth throttling?
--
Regards,
Nigel

"Jim Cone" wrote in message ...
-snip-
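[A small sketch of per-file timing, in case it helps pin down where the progressive slowdown comes from: it wraps the existing open/extract/close step and logs the elapsed seconds for each file. Column G as the log column is an assumption.]

Sub ProcessFilesTimed()
    Dim dataWb As Workbook
    Dim xr As Long, t As Single

    For xr = 1 To 1350
        t = Timer
        Set dataWb = Workbooks.Open("H:\myFiles\" & shControl.Cells(xr, 1).Value)
        With dataWb.Sheets("Appraisal")
            shControl.Cells(xr, 4).Value = .Cells(10, "I").Value
            shControl.Cells(xr, 5).Value = .Cells(11, "I").Value
            shControl.Cells(xr, 6).Value = .Cells(12, "I").Value
        End With
        dataWb.Close False
        Set dataWb = Nothing
        ' elapsed seconds for this file, logged next to the extracted values
        shControl.Cells(xr, 7).Value = Timer - t
    Next xr
End Sub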
Opening/Closing Large Files
Yes, I did read it and am going to give it a try, together with the read-from-a-closed-file approach from John W's site. Thanks
--
Regards,
Nigel

"Dave Peterson" wrote in message ...
-snip-
Opening/Closing Large Files
I suspect it's the string pool ...
AFAIK the way Excel works, it stores every unique string it finds in the string pool, and then every time it finds a string it looks it up in the string pool to see if it already exists. I think the string pool only ever gets larger and slower within an Excel session.

If it is the root cause of the problem, then the only solution is to close Excel partway through and start again from where you left off.

Charles
__________________________________________________
Outlines for my Sessions at the Australia Excel Users Group
http://www.decisionmodels.com/OZEUC.htm

"Nigel" wrote in message ...
-snip-
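[A minimal sketch of the close-partway-and-resume idea: process a limited batch per Excel session and record where you stopped so the next session carries on from there. The batch size and the progress cell (H1) are assumptions, not anything from the thread's code.]

Sub ProcessNextBatch()
    Const BATCH_SIZE As Long = 100           ' assumed batch size
    Dim dataWb As Workbook
    Dim xr As Long, startRow As Long, lastRow As Long

    ' resume point kept in an assumed spare cell
    startRow = shControl.Range("H1").Value
    If startRow < 1 Then startRow = 1
    lastRow = Application.WorksheetFunction.Min(startRow + BATCH_SIZE - 1, 1350)

    For xr = startRow To lastRow
        Set dataWb = Workbooks.Open("H:\myFiles\" & shControl.Cells(xr, 1).Value)
        With dataWb.Sheets("Appraisal")
            shControl.Cells(xr, 4).Value = .Cells(10, "I").Value
            shControl.Cells(xr, 5).Value = .Cells(11, "I").Value
            shControl.Cells(xr, 6).Value = .Cells(12, "I").Value
        End With
        dataWb.Close False
        Set dataWb = Nothing
    Next xr

    ' remember where to pick up next session, then save so Excel can be closed
    shControl.Range("H1").Value = lastRow + 1
    ThisWorkbook.Save
End Sub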
Opening/Closing Large Files
Good call. I read the MS article on the string pool, and one fix is to create a large string pool by filling a temporary sheet with a large number of strings. When the workbook is re-opened, this pool results in Excel not adding more strings, as a large pool already exists. Sample code below, which I am about to put to the test.

Public Sub GrowStringPool()
    Dim row As Long
    Dim col As Long
    Dim text As String

    text = "Here is some text:"
    ' fill 21 columns x 1000 rows, each cell getting a unique string
    For col = 1 To 21
        For row = 1 To 1000
            Cells(row, col).Value = text & Str((col - 1) * 1000 + row)
        Next row
    Next col
End Sub

--
Regards,
Nigel

"Charles Williams" wrote in message ...
-snip-
Opening/Closing Large Files
It worked!
I added a large string pool, re-opened the file, and ran my code. The total load time for a few files trebled. Clearing the sheet with the strings on it and re-running the code reduced the load time back. This appears to be evidence that the string pool is an issue. I am now testing with a high volume of files.

Thanks for the help
--
Regards,
Nigel

"Nigel" wrote in message ...
-snip-
Opening/Closing Large Files
I had not seen the MSKB article before, well found.
Let us know if the suggested bypass does indeed speed things up.

regards
Charles
__________________________________________________
Outlines for my Sessions at the Australia Excel Users Group
http://www.decisionmodels.com/OZEUC.htm

"Nigel" wrote in message ...
-snip-
Opening/Closing Large Files
Just trying to summarize...
This KB article??
http://support.microsoft.com/kb/911580/en-us
Importing data may take longer than expected or Excel may appear to stop responding (hang) when you import data into a workbook in Excel 2002 or in Office Excel 2003

But I'm still kind of confused about the solution for Nigel. (This is taken mostly from that KB article.)

1. Start Excel, and open the workbook where you want to import the data.
2-6. Add a temporary worksheet to the workbook and fill it with a bunch of strings.
7. Save the workbook.
8. Close and reopen the workbook. Excel will recognize that the workbook already contains a large existing string pool. This will prevent the addition of new string data into the existing string pool.
9. Delete the temporary worksheet that you added in step 2.
10. Nigel would start his routine that gets data from his 1350 other workbooks.

I wasn't sure if step 10 should come before step 9.

Charles Williams wrote:
-snip-

--
Dave Peterson
Opening/Closing Large Files
Well, IF it works, I imagine it would presumably be because:

- Excel has an internal optimisation which says the string table has a maximum size (<21000 entries in these Excel versions; I wonder what size it is in Excel 2007), and if you open a workbook containing a string table of this size, Excel sets a session variable saying do not add any more strings to the table.
- Deleting the sheet that the string table was derived from empties the string table but does not reset the do-not-add switch.
- Because the string table is empty and the do-not-add switch is on, it is very fast to search the string table and no time is used adding anything to it (the time for both these operations would presumably be large for a large string table).

So if this guess/explanation is correct, yes, you should do step 9 before step 10. But I am not convinced that all this is going to work any faster!

BTW we could probably find out by experiment what the max size of the string table is for Excel 2007, since it's a separate component inside the new file format.

Charles
__________________________________________________
Outlines for my Sessions at the Australia Excel Users Group
http://www.decisionmodels.com/OZEUC.htm

"Dave Peterson" wrote in message ...
-snip-
Opening/Closing Large Files
Thanks for your thoughts, Charles.
My gut feeling is that it would still be quicker to build formulas to retrieve just 3 values from each of the 1350 workbooks.

Do you have a recommendation for Nigel?

Charles Williams wrote:
-snip-

--
Dave Peterson
Opening/Closing Large Files
Some more thoughts, for what they are worth...
1. Those wouldn't all be xl 2003 workbooks that calculate when they open?
2. Could you move the folder to your own drive, do your work, and move it back?
3. Dave's idea about using formulas seems more like the way to go. You wouldn't have to open each workbook, and since the cell locations are fixed, the formulas couldn't be very complicated.
--
Jim Cone
San Francisco, USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)

"Nigel" wrote in message
-snip-
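[A minimal sketch that combines Jim's first two points: switch calculation to manual so each workbook does not recalculate on open, and copy each file to a local folder before opening it. The use of the TEMP folder as the staging area and the UpdateLinks/ReadOnly open arguments are assumptions for illustration.]

Sub ProcessFilesLocally()
    Dim dataWb As Workbook
    Dim xr As Long
    Dim srcPath As String, localPath As String

    ' stop the workbooks recalculating as they open
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False

    For xr = 1 To 1350
        srcPath = "H:\myFiles\" & shControl.Cells(xr, 1).Value
        localPath = Environ("TEMP") & "\" & shControl.Cells(xr, 1).Value

        FileCopy srcPath, localPath          ' pull the file off the network first
        Set dataWb = Workbooks.Open(localPath, UpdateLinks:=0, ReadOnly:=True)
        With dataWb.Sheets("Appraisal")
            shControl.Cells(xr, 4).Value = .Cells(10, "I").Value
            shControl.Cells(xr, 5).Value = .Cells(11, "I").Value
            shControl.Cells(xr, 6).Value = .Cells(12, "I").Value
        End With
        dataWb.Close SaveChanges:=False
        Set dataWb = Nothing
        Kill localPath                       ' tidy up the local copy
    Next xr

    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
End Sub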
Opening/Closing Large Files
Yes, I think you are right ... I would think that's the way to go in this case (assuming that reading from closed workbooks does not build the string table).

(But I still want to know if the magic string table trick works!!!)

My gut feeling is that it would still be quicker to build formulas to retrieve just 3 values from each of the 1350 workbooks.

Charles
__________________________________________________
Outlines for my Sessions at the Australia Excel Users Group
http://www.decisionmodels.com/OZEUC.htm
Opening/Closing Large Files
I'm interested in Nigel's results, too.
I had to do something similar a longgggg time ago. But I didn't know the worksheet names and had to retrieve more than 3 values. (And not close to 1000 files!)

It was much, much quicker to copy the files to my local C: drive and run it there than to open them across the network.

Charles Williams wrote:
-snip-

--
Dave Peterson
Opening/Closing Large Files
Thanks, guys, for stimulating the debate. The string pool solution did not speed things up; if anything it slowed them down, in that I loaded a large pool as per the MS article and timed the open-extract-close steps, which increased. I then removed the temporary sheet and the process went back to normal speed.

If the limit is 21,000 pool items, I guess I reach that at around 700 files, the point at which my process is at a near standstill, so I could save, re-open, and resume maybe?

However, I am working on a formula version as proposed and will be testing that soon. I will let you know how it goes. Thanks for all your advice.
--
Regards,
Nigel

"Dave Peterson" wrote in message ...
-snip-
Opening/Closing Large Files
Hi Jim
They are xl2003 and the calculation is not switched to manual, so I will do that, but in fact there is not much to calculate in these workbooks; they contain a lot of forms and formatting, hence the large size.

I have not tried to move them locally, so that certainly might be an option if the network is the limiting factor. I am working on a formula approach and will be testing it soon.

Thanks
--
Regards,
Nigel

"Jim Cone" wrote in message ...
-snip-
Opening/Closing Large Files
Thanks for posting back.
I'm not sure I'd try the save, close, reopen, resume stuff. That doesn't sound too different (to me) from just doing the first ### of the 1350 files. (But I've been wrong lots of times.)

Nigel wrote:
-snip-

--
Dave Peterson
Opening/Closing Large Files
One other thing to check out is the number of files in your temp directory: large numbers can slow things down.

Charles
__________________________________________________
Outlines for my Sessions at the Australia Excel Users Group
http://www.decisionmodels.com/OZEUC.htm

"Nigel" wrote in message ...
-snip-
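[A small sketch for the temp-directory check: count the files in the user's temp folder. The delete line is left commented out, since whether those files are safe to remove depends on what else is running; treat this as a diagnostic sketch, not something from the thread.]

Sub CountTempFiles()
    Dim tempPath As String, f As String
    Dim n As Long

    tempPath = Environ("TEMP") & "\"
    f = Dir(tempPath & "*.*")
    Do While Len(f) > 0
        n = n + 1
        ' Kill tempPath & f    ' uncomment with care to delete leftovers
        f = Dir
    Loop

    MsgBox n & " files found in " & tempPath
End Sub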
Opening/Closing Large Files
The formula approach worked. It reduced processing time to under 20 minutes to make all the connections and update the data. I chose to do a copy / replace-with-values after the full load of files, but it might have been better to replace as I looped through each file.

Thanks for all your input, advice and suggestions. We all learnt something!
--
Regards,
Nigel

"Charles Williams" wrote in message ...
-snip-
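[A minimal sketch of the convert-to-values step Nigel mentions, shown both ways: one pass over the whole result range after loading, or row by row inside the loop. The D:F range assumes the layout of the original code.]

Sub FreezeFormulaResults()
    ' one pass after all the link formulas have calculated
    With shControl.Range("D1:F1350")
        .Value = .Value          ' overwrite the formulas with their current values
    End With
End Sub

' ...or, inside the loop, convert each row as it is filled:
'   With shControl.Range(shControl.Cells(xr, 4), shControl.Cells(xr, 6))
'       .Value = .Value
'   End With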
Opening/Closing Large Files
Nigel,
You may have to add a "Set dataWb = Nothing" statement inside the For/Next loop. I believe objects may not be released when you re-Set them, which may still cause a memory leak, even though you may not actually be running out of memory.

Good luck,