#1
Posted to microsoft.public.excel.programming
I am still struggling with the use of the RegExp object from the VBScript_55 library in Excel/VBA. In my "old" regex processor, BKReplacem, I could use special characters such as \n in the REPLACEMENT string, not just the search string. That does not seem to work in VBScript: I get the literal "\n" in the output, for example. Perhaps I am trying to do things beyond this flavor of regex's capabilities.

I am trying to substantially reform and add information to multiple HTML docs, using a set of search-and-replace rules stored in a spreadsheet for ease of maintenance. I note that I have never seen a VBS-regex example involving multiple lines of text, but isn't that why they have the multi-line and global options? Does anyone know of a source of VBS-regex documentation or examples that is more oriented to text files, as opposed to field-validation applications?

I should mention that I am reading my text files from a text stream object into a single string variable so that I can process the entire document; line by line will not suffice for what I need to do.

To recap my questions:
1. Can I use special regex characters in the VBS replace string?
2. Can I process entire multi-line documents, and if so, is the one-string approach the right one in VBS?
3. Is there a source of documentation on more substantial multi-line VBS regexes?

Thanks!
#2
You should be able to use Chr(10) or vblf/vbcrlf instead of \n.
Tim
#3
Thanks Tim, but I don't think those VB expressions will be recognized by the RegExp object. With any regex processor I have to use one of the "official" regex expressions like "\r\n" or "\x0d\x0a" or "\cM\cJ", which should all produce the equivalent of "vbcrlf" in the output string.

I could produce the equivalent HTML control expressions, but I am trying to be able to interpret and reproduce multi-line expressions, not just create a new line by outputting (say) "<BR>". The difference is subtle, but for what I am doing it is important.

VBS regex is supposed to treat end-of-line characters as valid characters that will match the "." when multi-line mode is set true. Mine does not, and I can't understand why. For example, with multi-line option = TRUE, the pattern

^<CENTER>.*?</CENTER>

should match both:

<CENTER>Hello World</CENTER>

and

<CENTER>
Hello World
</CENTER>

BUT it only seems to match the first, one-line, string. I can't seem to match multi-line strings no matter what I do.
#4
On Fri, 27 Apr 2007 11:36:03 -0700, Dave Runyan wrote:

> VBS regex is supposed to treat end-of-line characters as valid characters that will match the ".", when multi-line mode is set true. Mine does not, and I can't understand why.

In VB, the dot "." does NOT match \n, regardless of how Multiline is set. The Multiline property only changes how ^ and $ are interpreted: with it set, they match at the start and end of each line rather than only at the start and end of the whole string. If you want to match all characters, including \n, you need to use something like [\s\S].

So if you want to match both options above, use the pattern:

^<CENTER>[\s\S]*?</CENTER>

or

^<CENTER>[\s\S]*</CENTER>

--ron
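For anyone who wants to check this behaviour in the Immediate window, here is a minimal VBA sketch of the point made in this reply. It assumes late binding through CreateObject("VBScript.RegExp"), assumes the closing angle brackets belong in the pattern, and assumes CR+LF line endings in the sample text; the procedure and variable names are made up for illustration.

```vba
Sub DemoDotVersusAnyChar()
    ' Late binding: no project reference to the VBScript regex library is needed.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    Dim oneLineText As String
    Dim threeLineText As String
    oneLineText = "<CENTER>Hello World</CENTER>"
    threeLineText = "<CENTER>" & vbCrLf & "Hello World" & vbCrLf & "</CENTER>"

    ' MultiLine only changes where ^ and $ may match.
    re.Pattern = "^Hello"
    re.MultiLine = False
    Debug.Print re.Test(threeLineText)   ' False: ^ matches only at the start of the string
    re.MultiLine = True
    Debug.Print re.Test(threeLineText)   ' True: ^ now also matches after a line break

    ' "." does not cross a line feed, whatever MultiLine is set to.
    re.Pattern = "^<CENTER>.*?</CENTER>"
    Debug.Print re.Test(oneLineText)     ' True
    Debug.Print re.Test(threeLineText)   ' False

    ' [\s\S] matches any character, including line breaks.
    re.Pattern = "^<CENTER>[\s\S]*?</CENTER>"
    Debug.Print re.Test(oneLineText)     ' True
    Debug.Print re.Test(threeLineText)   ' True
End Sub
```

The lazy form [\s\S]*? stops at the first closing tag, which is usually what is wanted when a document contains several CENTER blocks; the greedy form runs to the last one.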
#5
Okay, that makes sense, but I still don't understand two things:
1. Why couldn't I explicitly match \r\n in the input text? I confirmed that the input text stream contained crlf at the right point.
2. Can I use \r\n etc. in the REPLACE string to GENERATE those characters in the result?

Thanks!
#6
On Fri, 27 Apr 2007 15:42:01 -0700, Dave Runyan wrote:

> Okay, that makes sense, but I still don't understand two things:
> 1. Why couldn't I explicitly match \r\n in the input text - I confirmed that the input text stream contained crlf at the right point.

Most likely, somewhere between the input text stream and the input to the VB routine, the \r is getting stripped out.

> 2. Can I use \r\n etc. in the REPLACE string to GENERATE those characters in the result?

I'm pretty certain you cannot, although others more knowledgeable about VBScript may have a workaround. I believe that the replace string can only contain literals or subexpressions.

You can use Chr(10). For example, given your source string:

<CENTER>Hello World</CENTER>

and you want to insert \n as in your second example (before and after the CENTER'd string), you could do something like this:

Pattern = "(<CENTER>)(.*)?(</CENTER>)"
Replace String = "$1" & Chr(10) & "$2" & Chr(10) & "$3"

--ron
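A rough VBA sketch of that suggestion, in case it helps. It assumes the three groups in the pattern were meant to capture the opening tag, the text, and the closing tag, and it uses vbCrLf rather than Chr(10) so the result has Windows line endings. ExpandEscapes is a hypothetical helper, not part of the RegExp object: it is one possible way to let rules kept as worksheet text use \r and \n, by converting them in VBA before the string is handed to Replace.

```vba
Sub DemoNewlineInReplacement()
    ' Late binding: no project reference to the VBScript regex library is needed.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "(<CENTER>)(.*)(</CENTER>)"

    Dim src As String
    src = "<CENTER>Hello World</CENTER>"

    ' VBA builds the line breaks before Replace ever sees the string.
    Debug.Print re.Replace(src, "$1" & vbCrLf & "$2" & vbCrLf & "$3")

    ' A rule stored as plain text can be expanded first, then used the same way.
    Debug.Print re.Replace(src, ExpandEscapes("$1\r\n$2\r\n$3"))
End Sub

' Hypothetical helper: turns the two-character sequences \r and \n into
' real CR and LF characters. It does not handle an escaped backslash (\\n).
Function ExpandEscapes(ByVal rule As String) As String
    rule = Replace(rule, "\r", vbCr)
    rule = Replace(rule, "\n", vbLf)
    ExpandEscapes = rule
End Function
```

Both calls should print the tag, the text, and the closing tag on three separate lines. The $1, $2, $3 references are still resolved by the RegExp object; only the line-break characters come from VBA.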