#1 - Keith R
Posted to microsoft.public.excel.programming
Subject: Scraping/listing document URLs on a server that don't have web pages/existing links?

We have a server-side database that includes BLOBs of MS Word documents.
Some of those documents have always been available via URL hyperlinks on an
intranet web page. We've asked the administrator to expose a new group of
documents from that database; links to those newly exposed documents won't
be built into the existing web interface, but they should still be reachable
via individual intranet URLs.

As the first step in a new project, I'd like to scrape all of those document
URLs. I assume it would be something like a recursive tree search, since the
documents have a hierarchical order in the existing web interface - for
example,
/WPSAC/1-1000/356/356.doc
/WPSAC/1-1000/781/781.doc
/WPSAC/1001-2000/2294/2294.doc
/WPSAC/1001-2000/2770-2790/specials/2776.doc
/WPSAC/Revised/single_entry/B438.doc
etc.

I haven't worked with web stuff at all (although I'm decent with VBA) - any
pointers on where to start, to build a list in Excel where each sequential
cell contains a link to the next Word document?

My next step on the project will be to build a user interface so a user can
select a document number and have it automatically load the link, but I need
to get the links themselves first.
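
To make the question concrete, the sort of thing I'm imagining is below -
only a sketch, since the "http://ourintranet" host name and the 1-1000 folder
block are placeholders I've made up, I don't know whether our server even
answers HEAD requests, and it doesn't handle the odd subfolders like
/specials/ at all:

Sub ListDocumentLinks()
    ' Probe candidate URLs and add a hyperlink to the sheet for each one found.
    ' Host name and folder pattern below are made-up placeholders.
    Dim http As Object, ws As Worksheet
    Dim docNum As Long, r As Long
    Dim url As String

    Set http = CreateObject("MSXML2.XMLHTTP")
    Set ws = Worksheets("Sheet1")
    r = 1

    For docNum = 1 To 1000
        url = "http://ourintranet/WPSAC/1-1000/" & docNum & "/" & docNum & ".doc"
        On Error Resume Next
        http.Open "HEAD", url, False    ' HEAD just asks whether the file exists
        http.send
        If Err.Number = 0 Then
            If http.Status = 200 Then   ' found it - record a clickable link
                ws.Hyperlinks.Add Anchor:=ws.Cells(r, 1), _
                                  Address:=url, TextToDisplay:=url
                r = r + 1
            End If
        End If
        Err.Clear
        On Error GoTo 0
    Next docNum
End Sub

The later user-interface step could then just look up the chosen document
number in column A and call ThisWorkbook.FollowHyperlink on that address.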

Thanks!
Keith


#2 - Keith R
Posted to microsoft.public.excel.programming

Sorry, I forgot to mention I'm using XL2003 on WinXP.

"Keith R" wrote in message
...
We have a server-side database that includes BLOBs of MS word documents.
Some of those documents have always been available via URL hyperlinks on
an intranet web page. We've asked the administrator to expose a new group
of documents from that database, although the links to those newly exposed
documents won't be built into the existing web interface (they should
still be available via individual intranet URLs).

As the first step in a new project, I'd like to scrape all of those
document URLs. I assume it would be something like a recursive tree
search, since there is a heirarchical order of the documents in the
existing web interface- so for example,
/WPSAC/1-1000/356/356.doc
/WPSAC/1-1000/781/781.doc
/WPSAC/1001-2000/2294/2294.doc
/WPSAC/1001-2000/2770-2790/specials/2776.doc
/WPSAC/Revised/single_entry/B438.doc
etc.

I haven't worked with web stuff at all (although I'm decent with VBA)- any
pointers on where to start, to build a list in Excel where each sequential
cell contains a link to the next word document?

My next step on the project will be to build a user interface so a user
can select a document number and have it automatically load the link, but
I need to get the links themselves first.

Thanks!
Keith



#3 - Tim
Posted to microsoft.public.excel.programming

Keith,

From your description it sounds as though the docs are kept in a database
(which flavour?) and not on the web server's filesystem. If that's the case
then you won't be able to "scrape" the URLs: there is no Dir() equivalent for
documents stored as rows in a table.
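
Just to show what I mean by a Dir()-style listing: if the .doc files did sit
on an ordinary file share (the \\server\WPSAC path below is made up), you
could walk the folder tree recursively with the FileSystemObject, something
like:

Sub ListDocsOnShare(ByVal folderPath As String, ByVal ws As Worksheet, ByRef r As Long)
    ' Recursively list every .doc under folderPath and hyperlink it in column A.
    ' Only works for files on a real filesystem/share, not for BLOBs in a database.
    Dim fso As Object, fld As Object, f As Object, sf As Object
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set fld = fso.GetFolder(folderPath)

    For Each f In fld.Files
        If LCase$(fso.GetExtensionName(f.Path)) = "doc" Then
            ws.Hyperlinks.Add Anchor:=ws.Cells(r, 1), _
                              Address:=f.Path, TextToDisplay:=f.Name
            r = r + 1
        End If
    Next f

    For Each sf In fld.SubFolders
        ListDocsOnShare sf.Path, ws, r    ' recurse into each subfolder
    Next sf
End Sub

' e.g.  Dim r As Long: r = 1
'       ListDocsOnShare "\\server\WPSAC", Worksheets("Sheet1"), r

With the docs living only as rows in a table, though, there's nothing like
that to point at.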

If web links aren't going to be created for the new docs, it's not clear how
they're being exposed to you. If they're in a database then potentially you
could use something like ADO to search and index the docs from Excel. Even
then, creating a clickable hyperlink in XL would require some kind of
scripting set up on the server to deliver the requested file from the DB
table.
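
If you can get read access to the table, something along these lines would
pull a list of document IDs into a sheet. Bear in mind that everything below
is invented for illustration - the connection string, the table and column
names, and especially the getdoc.asp page, which the server admin would have
to provide to stream the requested BLOB back as a .doc:

Sub ListDocsViaADO()
    Dim cn As Object, rs As Object, ws As Worksheet
    Dim r As Long, url As String

    Set ws = Worksheets("Sheet1")
    Set cn = CreateObject("ADODB.Connection")
    ' Connection string is a guess - it depends entirely on the actual database.
    cn.Open "Provider=SQLOLEDB;Data Source=YourServer;" & _
            "Initial Catalog=YourDb;Integrated Security=SSPI;"

    Set rs = CreateObject("ADODB.Recordset")
    rs.Open "SELECT DocID FROM Documents ORDER BY DocID", cn

    r = 1
    Do While Not rs.EOF
        ' getdoc.asp is hypothetical - some server-side script has to fetch
        ' the BLOB for this ID and return it as a Word file.
        url = "http://ourintranet/getdoc.asp?id=" & rs.Fields("DocID").Value
        ws.Hyperlinks.Add Anchor:=ws.Cells(r, 1), Address:=url, _
                          TextToDisplay:=CStr(rs.Fields("DocID").Value)
        r = r + 1
        rs.MoveNext
    Loop

    rs.Close
    cn.Close
End Sub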

Tim


"Keith R" wrote in message
...
We have a server-side database that includes BLOBs of MS word documents.
Some of those documents have always been available via URL hyperlinks on
an intranet web page. We've asked the administrator to expose a new group
of documents from that database, although the links to those newly exposed
documents won't be built into the existing web interface (they should
still be available via individual intranet URLs).

As the first step in a new project, I'd like to scrape all of those
document URLs. I assume it would be something like a recursive tree
search, since there is a heirarchical order of the documents in the
existing web interface- so for example,
/WPSAC/1-1000/356/356.doc
/WPSAC/1-1000/781/781.doc
/WPSAC/1001-2000/2294/2294.doc
/WPSAC/1001-2000/2770-2790/specials/2776.doc
/WPSAC/Revised/single_entry/B438.doc
etc.

I haven't worked with web stuff at all (although I'm decent with VBA)- any
pointers on where to start, to build a list in Excel where each sequential
cell contains a link to the next word document?

My next step on the project will be to build a user interface so a user
can select a document number and have it automatically load the link, but
I need to get the links themselves first.

Thanks!
Keith



Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
In Excel using URLs, how to display pictures stored on a server? jwpowers Excel Discussion (Misc queries) 1 August 20th 07 10:50 PM
listing server of demain in sheet sal21 Excel Programming 1 March 22nd 07 10:15 AM
Convert urls to links ellen Excel Discussion (Misc queries) 2 October 5th 06 12:25 PM
Inserting links/URLs into cell w/ other text adam Excel Discussion (Misc queries) 1 February 21st 06 08:30 PM
SQL Server Listing and SQL Databases YG Lim Excel Programming 1 April 14th 05 09:21 AM


All times are GMT +1. The time now is 09:00 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"