View Single Post
  #11   Report Post  
Posted to microsoft.public.excel.programming
MikeZz MikeZz is offline
external usenet poster
 
Posts: 152
Default Very Slow reading excel data into an array (while opened in ne

Hi JP, I'll try to answer all your questions.
First, I went down the path of creating new instances because I had a huge
memory problem ( Excel kept going higher and higher bringing it to a crawl ).
I tried creating a new instance because it seemed to resolve the issue. The
problem then became that it took forever just to read in the data... No hard
data but just looking at the clock I'd say it went from "instantly" to
several seconds.

Responding to your questions in a previous post:
First, let me explain the purpose of this app. Our customer sends us
hundreds of contracts each month in pdfs. The information is structured so
poorly it's very difficult to review... my guess is it's done this way
intentionally. If you saw it, you wouldn't believe it.

We use a batch pdf converter to convert the files to excel. I then wrote an
app that can pick bits and pieces of info spread all over a very chaotic the
file and summarize it in an excel table. The file itself is about 95% blank
cells and 99% of the actual data is useless. We only need the 1% that's
randomly spread in between. The end result of my app was really a quantum
leap in productivity. We would have someone manually read and retype a lot
of information from pdf contracts taking many hours a week. Now it's
relatively instant with a dramatic decrease in time with a dramatic increase
in accuracy of contract review.

Basically, I'm trying to fix a problem that in all honesty, could be fixed
by the customer just by reformating their contracts. The thing is that if
they make it easier to read, it would help to highlight errors that are
sometimes intentional... because it gives them more money. Harsh but true.

So anyway, I have all the logic to scan through an array to pluck the few
bits of useful info out of the file. It works and it works extremely well.
I just need to get the data into an array.

1)... Because of the above described randomness, a pivot table wouldn't work
because the files I'm opening are not in a table format.

2) I put the files into the same folder. They are just translations from
pdf so I don't update them. The ranges are all random. Basically I have to
read the entire file into an array... which is usually under 20 columns but
could be a couple thousand rows long.

3) I can't simply copy and paste data... like I said, what I'm doing is
like taking a shotgun blast, and putting 1 out of 100 pellets in a nice
straight line.

4) I've went through it pretty throughly. It's extremely complicated but I
think I've done a good job of getting it to solve a difficult problem.

5) The few routines I pasted are only a fraction of the total code. You
are seeing the entire Declaration which may be quite extensive.
colFileName = col0: col0 = col0 + 1
I've found the above to actually work rather nicely because in development,
I sometimes change the output table and this allows the most flexibility to
add or move data columns around.

6) r = 1 To rowsMaster
Part of my poor naming because I actually pulled this code from another
project.
Master is just the data file and arrMaster is just the temp array containing
the data found in the file called "Master".
I'll try the technique you mentioned as it is probably a lot quicker.
Is there a quick way to call range a1 to the "bottom right" of the file?

A couple other comments from Joel's post I'll also answer he
I occaisionally auto save the summary file since it does get bigger and this
solved a previous out of memory error I had before, however, a summary file
with 150 contracts summarized in it is still only about 6MB of file size so
it's not much. I regularly deal with excel files pushing 30+MB without
issues.

.... Ok, after all that, I just tried the bulk read a range into an array
like you suggested and that it seemed to help ALOT. Now, if I pursue this
method of creating new xlApps, how do I stop my screen from blinking like
crazy... Basically before creating all the new apps, I could run the macro
and do anything else outside of excel. Now whenever the a file opens to read
(about 2x a second), it moves focus from whatever I'm doing back to excel...
can't even catch up on the news any more!

Thanks again for all your help... I'd be amazed if just a single command
will improve my code that much but hey, I'll take it any way I can get.

MikeZz

"JP" wrote:

But why are we even creating a new instance of the Excel application
at all? This code is running natively inside Excel (if I assume
correctly).

--JP


On Sep 13, 10:26 am, Joel wrote:
I believe this is a memory leak

Set xl = Nothing

You aren't releasing any memory. All you are doing is setting the variable
xl to nothing. The memory is still allocated to the program.

Try this

Sub MyMacro()
Dim xl
Set xl = CreateObject("word.application")
xl.Visible = True
Set xl = Nothing

End Sub

The word document stays opened. Close the word document manually. If the
object was invisiable it would still be running. You can prove this by not
making the object visible and creating the object in the code above. Then go
to Task Manager. You will still see Word running. Just because you cna't
see it running doesn't mean that it is NOT running.

Now this

Sub MyMacro()
Dim xl
Set xl = CreateObject("word.application")
xl.Visible = True

xl.Quit

End Sub