#1
Posted to microsoft.public.excel.programming
Regex techniques

I am still struggling with the use of the RegExp object in the VBScript_55
library, in Excel/VBA.

In my "old" regex processor, BKReplacem, I could use special characters such
as \n in the REPLACEMENT string, not just the search string. But that does
not seem to work in VBScript... I would get the literal "\n" in the output,
for example.

Perhaps I am trying to do things beyond this flavor of regex's capabilities.
I am trying to substantially reform and add information to multiple HTML
docs, using a set of search and replace rules stored in a spreadsheet for
ease of maintenance.
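
A rule table like the one described here could be applied with a loop along the following lines. This is only a sketch under assumptions: the function name ApplyRules and the two-column layout (pattern in the first column, replacement in the second) are hypothetical, not something stated in this post.

```vba
' Sketch: apply every (pattern, replacement) row of a 2-D array to a string.
' The array is assumed to come from a two-column worksheet range.
Function ApplyRules(ByVal text As String, ByVal rules As Variant) As String
    Dim re As Object
    Dim i As Long
    Dim patCol As Long
    Dim pat As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True       ' replace every match, not just the first
    re.MultiLine = True    ' let ^ and $ match at line breaks

    patCol = LBound(rules, 2)
    For i = LBound(rules, 1) To UBound(rules, 1)
        pat = CStr(rules(i, patCol))
        If Len(pat) > 0 Then          ' skip blank rows
            re.Pattern = pat
            text = re.Replace(text, CStr(rules(i, patCol + 1)))
        End If
    Next i
    ApplyRules = text
End Function
```

With the rules in, say, A2:B20 of a sheet, the call might look like ApplyRules(html, Range("A2:B20").Value); the range address is only an illustration.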

I note that I have never seen a VBS-regex example involving multiple lines
of text, but isn't that why they have the multi-line and global options?

Does anyone know of a source of VBS-regex documentation or examples that are
more oriented to text files, as opposed to field-validation applications?

I should mention that I am reading my text files in from a text stream
object into a single string variable, so that I can process the entire scope
of the document - line by line will not suffice for what I need to do.
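
For reference, the one-string approach described above might look like this minimal sketch. It assumes the Scripting.FileSystemObject text stream; the function name ReadWholeFile is hypothetical.

```vba
' Sketch: read a whole text file into one String via the Scripting runtime.
' Late binding is used so no project reference is required.
Function ReadWholeFile(ByVal filePath As String) As String
    Dim fso As Object
    Dim stream As Object

    Set fso = CreateObject("Scripting.FileSystemObject")
    Set stream = fso.OpenTextFile(filePath, 1)   ' 1 = ForReading

    ' ReadAll raises an error on an empty file, so check AtEndOfStream first
    If Not stream.AtEndOfStream Then
        ReadWholeFile = stream.ReadAll
    End If
    stream.Close
End Function
```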

To recap my questions:
1. Can I use special regex characters in the VBS replace string?
2. Can I process entire multi-line documents, and if so is the one-string
approach the right one in VBS?
3. Is there a source of documentation on more substantial multi-line VBS
regexes?

Thanks!

#2
Tim

You should be able to use Chr(10) or vbLf/vbCrLf instead of \n.

Tim

#3

Thanks Tim, but I don't think those VB expressions will be recognized by the
RegExp object. With any regex processor I have to use one of the "official"
regex expressions like "\r\n" or "\x0d\x0a" or "\cM\cJ", which should all
produce the equivalent of "vbCrLf" in the output string. I could produce the
equivalent HTML control expressions, but I am trying to be able to interpret
and reproduce multi-line expressions, not just create a new line by
outputting (say) "<BR>". The difference is subtle, but for what I am doing
it is important.

VBS regex is supposed to treat end-of-line characters as valid characters
that will match the ".", when multi-line mode is set true. Mine does not,
and I can't understand why.

For example, with multi-line option = TRUE,
the pattern ^<CENTER>.*?</CENTER> should match both:

<CENTER>Hello World</CENTER> and

<CENTER>
Hello World
</CENTER>

BUT it only seems to match the first, one-line, string. I can't seem to
match multi-line strings no matter what I do.

#4

On Fri, 27 Apr 2007 11:36:03 -0700, Dave Runyan
wrote:

> VBS regex is supposed to treat end-of-line characters as valid characters
> that will match the ".", when multi-line mode is set true. Mine does not,
> and I can't understand why.
>
> For example, with multi-line option = TRUE,
> the pattern ^<CENTER>.*?</CENTER> should match both:
>
> <CENTER>Hello World</CENTER> and
>
> <CENTER>
> Hello World
> </CENTER>
>
> BUT it only seems to match the first, one-line, string. I can't seem to
> match multi-line strings no matter what I do.


In VB, DOT "." does NOT match \n, regardless of how Multiline is set.

The Multiline property only changes how ^ and $ are interpreted.

If you want to match all characters, including \n, you need to use something
like [\s\S]

So if you want to match both options, above, use the pattern:

^<CENTER>[\s\S]*?</CENTER> or
^<CENTER>[\s\S]*</CENTER>

The first (lazy) form stops at the nearest </CENTER>; the second (greedy)
form runs to the last </CENTER> in the string.
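
The difference can be checked with a small test routine. This is a minimal sketch, assuming late binding to VBScript.RegExp; the names MatchesPattern and DemoDotVersusClass are made up for illustration.

```vba
' Sketch: test a pattern against a string with MultiLine switched on.
Function MatchesPattern(ByVal text As String, ByVal pattern As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pattern
    re.MultiLine = True    ' affects ^ and $ only, not "."
    MatchesPattern = re.Test(text)
End Function

Sub DemoDotVersusClass()
    Dim oneLine As String, threeLines As String
    oneLine = "<CENTER>Hello World</CENTER>"
    threeLines = "<CENTER>" & vbCrLf & "Hello World" & vbCrLf & "</CENTER>"

    Debug.Print MatchesPattern(oneLine, "^<CENTER>.*?</CENTER>")         ' True
    Debug.Print MatchesPattern(threeLines, "^<CENTER>.*?</CENTER>")      ' False
    Debug.Print MatchesPattern(threeLines, "^<CENTER>[\s\S]*?</CENTER>") ' True
End Sub
```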


--ron

#5

Okay, that makes sense, but I still don't understand two things:

1. Why couldn't I explicitly match \r\n in the input text - I confirmed that
the input text stream contained crlf at the right point.
2. Can I use \r\n etc. in the REPLACE string to GENERATE those characters in
the result?

Thanks!

"Ron Rosenfeld" wrote:

> In VB, DOT "." does NOT match \n, regardless of how Multiline is set.
>
> The Multiline property changes how ^ and $ are interpreted.
>
> If you want to match all characters, including \n, you need to use something
> like [\s\S]
>
> So if you want to match both options, above, use the pattern:
>
> ^<CENTER>[\s\S]*?</CENTER> or
> ^<CENTER>[\s\S]*</CENTER>



#6

On Fri, 27 Apr 2007 15:42:01 -0700, Dave Runyan
wrote:

> Okay, that makes sense, but I still don't understand two things:
>
> 1. Why couldn't I explicitly match \r\n in the input text - I confirmed that
> the input text stream contained crlf at the right point.


Most likely, somewhere between the input text stream and the input to the VB
routine, the \r is getting stripped out.
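
One way to test that guess is to inspect the string just before it reaches the RegExp object. A minimal sketch follows; the helper name DescribeLineEndings is hypothetical.

```vba
' Sketch: report which kind of line break appears in a string.
' Checks CRLF first, then a bare LF, then a bare CR.
Function DescribeLineEndings(ByVal text As String) As String
    If InStr(text, vbCrLf) > 0 Then
        DescribeLineEndings = "CRLF"
    ElseIf InStr(text, vbLf) > 0 Then
        DescribeLineEndings = "LF only"
    ElseIf InStr(text, vbCr) > 0 Then
        DescribeLineEndings = "CR only"
    Else
        DescribeLineEndings = "none"
    End If
End Function
```

If the result is "LF only", a pattern written as \r?\n will match either kind of line break.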

> 2. Can I use \r\n etc. in the REPLACE string to GENERATE those characters in
> the result?


I'm pretty certain you cannot, although others more knowledgeable about
VBScript may have a workaround. I believe that the replace string can only
contain literals or subexpression references such as $1.

You can use Chr(10) in VBA (or CHAR(10) if the replace string is built by a
worksheet formula).

For example,

Given your source string:

<CENTER>Hello World</CENTER>

and you want to insert \n as in your second example (before and after the
CENTER'd string), you could do something like this:

Pattern = "(<CENTER>)(.*)?(</CENTER>)"

Replace String = "$1" & Chr(10) & "$2" & Chr(10) & "$3"
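
When the replace strings live in worksheet cells, another possible workaround is to translate the two-character tokens \r and \n into real characters before calling Replace. The sketch below is an assumption-laden illustration, not a feature of the RegExp object: ExpandEscapes and RegexReplace are hypothetical helper names, and an escaped backslash such as \\n is not handled.

```vba
' Hypothetical helper: turn the tokens \r and \n in a rule's replacement
' text into real CR and LF characters.
' Simplification: "\\n" (an escaped backslash) is not treated specially.
Function ExpandEscapes(ByVal replacement As String) As String
    Dim result As String
    result = Replace(replacement, "\r", vbCr)
    result = Replace(result, "\n", vbLf)
    ExpandEscapes = result
End Function

' Sketch: global regex replace that expands the tokens first.
Function RegexReplace(ByVal text As String, ByVal pattern As String, _
                      ByVal replacement As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pattern
    re.Global = True
    re.MultiLine = True
    RegexReplace = re.Replace(text, ExpandEscapes(replacement))
End Function
```

For example, RegexReplace("<CENTER>Hello World</CENTER>", "(<CENTER>)(.*?)(</CENTER>)", "$1\n$2\n$3") should return the same text with a line feed after the opening tag and before the closing tag.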



--ron