Posted to microsoft.public.excel.programming
Dave Runyan
Regex Capture problem

Here are two regex search-and-replace string pairs. The first works, but the
second does not:

Search for = "<TITLEPage-1</TITLE"
Replace with = "<TITLEStaffing Process Model</TITLE"


This did not do any capture or replace:
Search for = "<CENTER\n([^\n]*)_DPM Model - (.*)HREF="([^\n]*).html"
Replace with = "<CENTER\n\1_DPM Model - \2HREF= "\1_DPM Model - \3.html"
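
One thing that may be worth checking, offered as a sketch rather than a diagnosis: the replacement strings above refer to captured groups as \1, \2 and \3. In the VBScript RegExp object, \1 is a backreference only inside the search pattern; the Replace method's replacement text refers to captured groups as $1, $2 and so on. The Sub name and the sample input below are made up for illustration and are not taken from the real file.

```vba
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
Sub DemoCaptureReplace()            ' hypothetical name, for illustration only
    Dim objRgx As New RegExp
    Dim strIn As String

    strIn = "Page-1.html"           ' made-up input, not from the real file
    objRgx.Global = True
    objRgx.Pattern = "(.*)\.html"   ' group 1 captures the text before ".html"

    ' $1 inserts the text captured by group 1.
    Debug.Print objRgx.Replace(strIn, "PREFIX_$1.html")

    ' \1 is not special in the replacement text; it is inserted literally.
    Debug.Print objRgx.Replace(strIn, "PREFIX_\1.html")
End Sub
```

Running this from the Immediate window and comparing the two printed lines shows whether the capture is being re-used or whether the backreference text is passed through unchanged.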

Here are the relevant lines in the HTML. The first three show where I want to
pick up the extended file name, which is "STAFFING_VISION-REF_2007-03-31_DPM_Model_-_"

<CENTER
<IMG SRC="STAFFING_VISION-REF_2007-03-31_DPM_Model_-_Talent_acquired.jpg"
ALT="" BORDER="0" USEMAP="#image_map"
</CENTER

Here is where I want to insert the extended filename in front of the "Talent
Need defined.html". The stuff after ".html" will be processed and removed in
later steps:

<AREA SHAPE="CIRCLE" COORDS="103,395,44" HREF="Talent Need defined.html,
1UP: Toplevel"

Here are the variables as sent to the function:
strRgxSearchPattern = "<CENTER\n\1_DPM Model - \2HREF= "\1_DPM Model - \3.html"
strRgxInput = "<CENTER\n([^\n]*)_DPM Model - (.*)HREF="([^\n]*).html"


Here is the function:
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
Private Function RgxReplaceText(strRgxSearchPattern, strRgxInput, _
                                strRgxOutput) As String
    Dim objRgx As New RegExp
    objRgx.IgnoreCase = False ' Set case sensitive.
    objRgx.Global = True ' Set replace all.
    objRgx.MultiLine = True ' Set multiline mode.
    objRgx.Pattern = strRgxSearchPattern
    ' strRgxInput is the text to search; strRgxOutput is the replacement text.
    RgxReplaceText = objRgx.Replace(strRgxInput, strRgxOutput)
End Function
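
To separate "the pattern never matches" from "the replacement text is wrong", a small diagnostic along these lines can help. It is only a sketch: RgxShowCaptures is a hypothetical helper, not part of the original code, and it assumes the same RegExp settings as the function above.

```vba
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5".
Sub RgxShowCaptures(strPattern As String, strText As String)   ' hypothetical helper
    Dim objRgx As New RegExp
    Dim objMatches As MatchCollection
    Dim objMatch As Match
    Dim i As Long

    objRgx.IgnoreCase = False
    objRgx.Global = True
    objRgx.MultiLine = True
    objRgx.Pattern = strPattern

    Set objMatches = objRgx.Execute(strText)
    Debug.Print "Matches found: " & objMatches.Count

    For Each objMatch In objMatches
        ' SubMatches is zero-based: index 0 holds capture group 1.
        For i = 0 To objMatch.SubMatches.Count - 1
            Debug.Print "  Group " & (i + 1) & ": [" & objMatch.SubMatches(i) & "]"
        Next i
    Next objMatch
End Sub
```

Calling it with the failing search pattern and the file text prints the number of matches and what each group captured. A count of 0 would suggest the search pattern itself does not fit the text (for instance, spaces in the pattern where the file name has underscores), before the replacement string comes into play.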

"Ron Rosenfeld" wrote:

On Wed, 25 Apr 2007 08:56:04 -0700, Dave Runyan wrote:

I am using the VBScript_RegExp_55 library in an Excel-based VBA project to
build a text-file processor that uses rules stored in an Excel spreadsheet.
I have used Regex utilities before, so I understand the concepts of text
capture and re-use in a Regex expression.

But when I try to capture text with (*), for example, and then try to re-use
it with \1, I get either no changes made, or an object-generated error 5081.
I have not been able to isolate when these two results occur.

I believe my program code is basically correct, because I can do search and
replace type stuff just fine, including in the same "run" as the capture
operations that fail. Here are my options settings:

objRgx.IgnoreCase = False ' Set case sensitive.
objRgx.Global = True ' Set replace all.
objRgx.MultiLine = True ' Set multiline mode.

Thanks for any help you can provide.


How about giving an example of your program code, input, and desired output?
--ron