Posted to microsoft.public.excel.programming
I'm parsing an HTML file. Originally I thought I only needed to capture all the links, and the following worked well in my particular application (sample HTML snippet pasted at bottom of post):

^<A HREF=.*>

However, I've now found that I only need to capture and process certain links. The information that determines whether a link needs to be processed is buried between that link and the next link (or EOF), so I need to capture a larger (multiline) section of text and test each one to see if it contains my identifier. It appears that I'm safe using the </TR> tag as something that always comes after my identifier and before the next link (or EOF).

So I'm trying to edit my regex to grab this larger (multiline) section of text. Then, if the identifier is the correct one, I'll use my first regex or a slightly modified version to grab just the URL from within the match.

I've been using http://www.aivosto.com/vbtips/regex.html as a helpful source on regex, but when I test my pattern on http://regexlib.com/RETester.aspx I get no results (my first expression worked fine). I think I'm pretty close, but the following isn't working:

^<A HREF=.*/TR>

Any advice? The only difference is replacing the single '>' with '/TR>'. I suspect it has to do with spaces or line breaks, but I don't know for certain.

I'm posting a sample of my much larger HTML below. I only want to capture the ^<A HREF=.*> URL match for items where the 'classtd' cell contains "Land Spread Vector". I prefer using multiple simple regex expressions over one donated expression that does it all, so that I can understand my own code and at least attempt to troubleshoot it if I need to change anything.

Any assistance would be greatly appreciated. Thanks!
Keith

<A Href=javascript:openDocument('0900043d802b3528');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0 align=left width=16> 101998 </a>
</td>
<td class='classtd'> Green-tipped Martin </td>
<td class='classtd'> CURRENT,3.2 </td>
</TR>
<TR>
<TD></TD>
<TD>
<A Href=javascript:openDocument('0900043d803a1ce4');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0 align=left width=16> 101998 - APRRE - Assert.doc </a>
</td>
<td class='classtd'> Land Spread Vector </td>
<td class='classtd'> CURRENT,3.0 </td>
</TR>
<TR>
<TD></TD>
<TD>
<A Href=javascript:openDocument('0900043d802b635e');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0 align=left width=16> 101998-R </a>
</td>
<td class='classtd'> Reevaluation </td>
<td class='classtd'> CURRENT,1.0 </td>
</TR>
</TD></TR></TABLE><BR><BR>
<CENTER>
<A Href='javascript:history.back();'><img src='/OurDir/images/back_down.jpg' border=0 align='center' alt='Back'></A>
<A Href='javascript:goHome();'><img src='/OurDir/images/home_down.jpg' border=0 align='center' alt='Home'></A>
</CENTER>
</BODY>
</HTML>
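For reference, the two-step approach described above (grab each link-through-</TR> block, test it for the identifier, then pull the URL out of the matching blocks) might be sketched as follows. This is only an illustration in Python, not VBA. It assumes every tag in the real file ends in `>`. The names `SAMPLE`, `BLOCK_RE`, `URL_RE`, and `links_for` are made up for the example. The likely culprit in the failing pattern is that `.` does not match line breaks, so `.*` never reaches a `</TR>` on a later line. The sketch uses `[\s\S]*?` instead, which matches any character including newlines and stops at the first `</TR>` rather than the last. It also drops the `^` anchor, since the link is not guaranteed to start a line.

```python
import re

# Hypothetical three-row sample modelled on the posted snippet,
# with the closing '>' of each tag assumed to be present.
SAMPLE = """<A Href=javascript:openDocument('0900043d802b3528');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0> 101998 </a>
</td>
<td class='classtd'> Green-tipped Martin </td>
</TR>
<TR>
<TD>
<A Href=javascript:openDocument('0900043d803a1ce4');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0> 101998 - APRRE - Assert.doc </a>
</td>
<td class='classtd'> Land Spread Vector </td>
</TR>
<TR>
<TD>
<A Href=javascript:openDocument('0900043d802b635e');>
<img src=/OurDir/images/formats/f_msw8_16.gif border=0> 101998-R </a>
</td>
<td class='classtd'> Reevaluation </td>
</TR>
"""

# Step 1: one block per link, running lazily to the first </TR> after it.
# [\s\S] matches any character, including line breaks.
BLOCK_RE = re.compile(r"<A\s+HREF=[\s\S]*?</TR>", re.IGNORECASE)

# Step 2: the HREF value is everything up to the tag's closing '>'.
URL_RE = re.compile(r"<A\s+HREF=([^>]*)>", re.IGNORECASE)


def links_for(html, identifier):
    """Return the HREF values of blocks whose text contains identifier."""
    urls = []
    for block in BLOCK_RE.findall(html):
        if identifier in block:
            match = URL_RE.search(block)
            if match:
                urls.append(match.group(1))
    return urls


print(links_for(SAMPLE, "Land Spread Vector"))
```

The same two patterns should carry over to the VBScript RegExp object with its IgnoreCase and Global properties set to True, since `[\s\S]` does not depend on any option that makes `.` match newlines. That has not been tested against the full file.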