Posted to microsoft.public.excel.programming
VBA Regular Expressions & URL Extraction

Greetings all,

I am trying to extract the URLs of a set of animated movies from
various sites using regular expressions and then dump those URLs into
an Excel workbook (via VBA). I have a decent grasp of regex, but I
have hit a brick wall with one particular site. I have experimented
with a number of patterns but cannot yet get the correct result.

The expected result is:
/site/olspage.jsp?skuId=8936896&st=Transformers+Widescreen&type=product&id=1754542


However, if I do get a non-null result back, it is usually:
http://www.bestbuy.com/site/olspage....ry&id=cat00000



---------------------- Sample Patterns Tested: ----------------------
.Pattern = "\<a\s+href=\W?(.*?)\W?\s?class=\W?prodlink\W? "
.Pattern = "\<a\s+href=""([A-Za-z0-9/;&\.\?\+-=]+)""\s+class"
.Pattern = "\<a\s+href=\W?(.*?)\W?\s?class=\W?\w\W?"



---------------------- Partial Source Data (from website): ----------------------

<div class="logo">
<a href="http://www.bestbuy.com/site/olspage.jsp?type=category&id=cat00000" name="&lid=hdr_logo"><img src="http://images.bestbuy.com:80/BestBuy_US/en_US/images/global/header/logo.gif"
alt="Best Buy Logo"/></a>
</div>

<td class="skucontent">

<a href="/site/olspage.jsp?skuId=8936896&amp;st=Transformers+Widescreen&amp;type=product&amp;id=1754542" class="prodlink">
Transformers - Widescreen Dubbed Subtitle AC3</a><br/>

---------------------- ---------------------- ----------------------

I'm most interested in using the class="prodlink" attribute, since it
is what marks a movie URL. I know that regex in VBA can be a bit
tricky owing to the doubled double quotes and other non-alphanumeric
characters, but can anyone spot what I'm doing wrong? Thanks for your
help!
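
For reference, here is a minimal sketch of one way the class="prodlink" idea could be written in VBA. It is only an illustration under stated assumptions, not a confirmed fix: the function name ExtractProdLinks, the DumpLinks test routine, and the hard-coded sample HTML string are all hypothetical, and the sketch assumes href always comes directly before class inside the tag and that both values are double-quoted, as in the partial source above. It uses a negated character class, [^"]*, for the URL so that the capture cannot run past the closing quote of href into a later tag.

```vba
Option Explicit

' Hypothetical helper: returns every href whose <a> tag carries class="prodlink".
' Late binding is used, so no reference to the VBScript RegExp library is required.
Function ExtractProdLinks(ByVal html As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim url As String
    Dim result As New Collection

    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = True          ' return all matches, not only the first
        .IgnoreCase = True
        ' Inside a VBA string literal each " is written as "".
        ' [^""]* stops at the closing quote of href, so one match
        ' cannot begin in the logo link and end in the product link.
        .Pattern = "<a\s+href=""([^""]*)""\s+class=""prodlink"""
    End With

    For Each m In re.Execute(html)
        url = m.SubMatches(0)
        ' Remove any line breaks inside the URL and decode &amp; to &
        url = Replace(Replace(url, vbCr, ""), vbLf, "")
        url = Replace(url, "&amp;", "&")
        result.Add url
    Next m

    Set ExtractProdLinks = result
End Function

' Hypothetical test routine: writes the extracted URLs to column A.
Sub DumpLinks()
    Dim html As String
    Dim links As Collection
    Dim i As Long

    ' Sample input assembled from the partial source data in the post
    html = "<a href=""http://www.bestbuy.com/site/olspage.jsp?type=category&id=cat00000"" name=""&lid=hdr_logo""></a>" & _
           "<a href=""/site/olspage.jsp?skuId=8936896&amp;st=Transformers+Widescreen&amp;type=product&amp;id=1754542"" class=""prodlink"">Transformers</a>"

    Set links = ExtractProdLinks(html)
    For i = 1 To links.Count
        ActiveSheet.Cells(i, 1).Value = links(i)
    Next i
End Sub
```

If the real page sometimes places other attributes between href and class, or uses single quotes, the pattern would need to be loosened accordingly.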

 