Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Automated Web Query / IQY / organizing results

Hello

I am new to this forum. I am also new on VBA. Thank you for you
indications that will help me find my way through.

I want to extract data from .htm files. All the files are in the sam
folder. Their names are "index-1.htm" to "index-6500.htm". (There ar
about 4500 files, meaning that some numbers are not used.) Th
interesting data are located in the middle of the html file, inside tw
tables.

Ultimately, I need the data inside a single excel spreadsheet, dat
from each page into a single line.

How can I do this?

Until now, I have try to use the query tool from, but I don't know ho
to configure it precisely.

The Data are held in two tables, 36 and 37.
My iqy files looks like this:
____________________________
WEB
1
file:///E:/JDN/index-23.htm

Selection=36,37
Formatting=None
PreFormattedTextToColumns=True
ConsecutiveDelimitersAsOne=True
SingleBlockTextImport=False
DisableDateRecognition=False
DisableRedirections=False
________________________________


I don't really know where to go from there. Could you help me validat
the different steps that I have planned and give me some hints to fin
my way?

1- From the final worksheet, launch a macro that opens the index-1.ht
file. If the record does not exist, it opens index-2.htm, and so on.

2- Import the data into a temp worksheet. (Where data can be process)

3- Erase unnecessary rows and cells. (i.e. the last rows of the secon
table needs to be erased)

4- Identify the data form the table depending on the title of the row
next to it (city, country, etc...). Typically, the second table ma
change in length: some records contain more fields than others
(business phone / home phone / Cell phone, etc.). When not used, the
don't appear on the .htm page.

At that point the data are more or less held in a table, 2 columns X 1
to 25 rows. The first column contains the type of data held in th
second column.

5- Moved the cells to a second table depending of the title next to it
Leave blank cells when the field has not been used.

6- Transpose the second column into a rows on the final sheet, clea
the temp sheet.

7- Close the index-2.htm files and try to open the index-3.htm file.

Done.

Is this a correct approach to the problem? Is there any other easie
way?

Also, the .htm files used to be on a web site. I used Teleport Pro t
download all of them, so it could be opened faster. Each .htm file
weight about 75Ko as there are many recursive info (top menu, lef
menu, roght menu, bottom menu) on each file. Do you know of an
automated solution to remove all the recursive lines ?


Your help is apreciated.

Best regards,
Bertran

--
Message posted from http://www.ExcelForum.com

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Reading results for automated Myers Briggs Type Indicator? Jetlag Excel Worksheet Functions 2 September 4th 06 11:07 PM
Web Query Sleep for until results come Sunil_Modi Excel Worksheet Functions 0 November 19th 04 04:29 PM
Set a PivotCache as the results of a query tod Excel Programming 2 February 16th 04 05:41 PM
MS Query Limit results Gunther[_2_] Excel Programming 2 December 16th 03 08:21 PM
Query Results Michael Burkett Excel Programming 1 November 11th 03 07:46 PM


All times are GMT +1. The time now is 05:05 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"