Posted to microsoft.public.excel.misc,microsoft.public.excel.programming,microsoft.public.excel
LunaMoon
Seeking the fastest workflow for converting a BMP image into Excel

On Sep 27, 3:59 pm, Joel wrote:
Pardon my French, but this is lunacy. There are much better methods of
getting Web data into Excel.

1) Use a Web Query. This may not work with all web pages and data, but it is
pretty simple. From the spreadsheet menu, choose Data - Import External Data -
New Web Query. Put your URL in the address box, then check the appropriate
boxes where your data is located.

2) Here is some simple code that opens a web page and downloads data directly
using Internet Explorer.

Sub GetStaff()

    Dim IE As Object, itm As Object
    Dim URL As String
    Dim RowCount As Long
    Dim First As Boolean

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True

    URL = "http://www.lehman.com/who/bios"

    'get web page and wait until it has finished loading
    IE.Navigate2 URL
    Do While IE.readyState < 4    '4 = READYSTATE_COMPLETE
        DoEvents
    Loop

    With Sheets("Sheet1")
        RowCount = 0
        For Each itm In IE.document.all
            'each SPAN starts a new row; its text goes in column A
            If itm.tagname = "SPAN" Then
                RowCount = RowCount + 1
                .Range("A" & RowCount) = itm.innertext
                First = True
            End If
            'italic text goes in column B of the current row
            If itm.tagname = "I" Then
                .Range("B" & RowCount) = itm.innertext
            End If
            'paragraph text goes in column C; the comparison was garbled in
            'the post, so skipping empty paragraphs is the assumed intent
            If itm.tagname = "P" And _
                itm.innertext <> "" Then

                'the first paragraph shares the SPAN row, later ones get new rows
                If First = True Then
                    First = False
                Else
                    RowCount = RowCount + 1
                End If
                .Range("C" & RowCount) = itm.innertext
            End If
        Next itm
    End With

End Sub
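
The Web Query from option 1 can also be created from code instead of the menu. What follows is only a minimal sketch: the macro name ImportWebTables and the URL are placeholders, not anything from the posts above, and it assumes the page exposes its data as ordinary HTML tables. It adds a query table to the active sheet, asks for every table on the page, and drops the web formatting so only the values arrive.

```vba
Sub ImportWebTables()

    'Hypothetical address: replace with the page that holds the tables
    Const PAGE_URL As String = "http://www.example.com/tables.html"

    Dim qt As QueryTable

    'A connection string starting with "URL;" makes this a web query
    Set qt = ActiveSheet.QueryTables.Add( _
        Connection:="URL;" & PAGE_URL, _
        Destination:=ActiveSheet.Range("A1"))

    With qt
        .WebSelectionType = xlAllTables        'import every HTML table on the page
        .WebFormatting = xlWebFormattingNone   'values only, no shading or fonts
        .RefreshStyle = xlOverwriteCells       'overwrite cells instead of inserting
        .Refresh BackgroundQuery:=False        'wait until the data has arrived
    End With

End Sub
```

Run it with an empty sheet active. If only some tables are wanted, xlSpecifiedTables together with the WebTables property is the usual alternative to xlAllTables.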

"LunaMoon" wrote:
Hi all,


I have a lot of tedious data input that I need to process. Currently I am
doing the whole thing manually, which is extremely tiring. Please
help me out:


The workflow:


(1) I browse web pages that contain a lot of numerical tables.
(2) I copy a table to the clipboard. At this stage the table is a BMP
image in the clipboard, with numerical and non-numerical cell values,
because the table has headings and decorations such as shaded headings
and shaded numbers. In fact, these tables were taken from Excel in the
first place; they are now just displayed on web pages and I have to put
them back into Excel.
(3) Hopefully, I can feed the clipboard image to an OCR converter that
turns a BMP image into an ASCII or Excel table.
(4) I copy the converted result into Excel and save it. We don't care
about the shaded headings, the decorations, etc., but all the numbers
should be there.


The fonts are pretty standard; as I said, the tables were originally
taken from Excel.


This workflow repeats again and again.


Please point me to a good OCR program that can be integrated into
my workflow and make it as fast as possible.


I know there is fancy OCR software available, but it takes PDF or
other files as input, which requires me to capture the web tables and
save them as files before using the OCR converter, which is very time
consuming.


I am looking for something that goes directly from a clipboard image to
ASCII/Excel/text tables in the clipboard.


Thank you!


No, these are Flash slides embedded in a web page, so your method might
not be the most appropriate one...