#1
Posted to microsoft.public.excel.programming
RegExp

Hi,

I am using Microsoft VBScript RegExp 5.5 within VBA for Excel.

http://msdn2.microsoft.com/en-us/lib...ka(VS.85).aspx

"Matches pattern and remembers the match. The matched substring can be
retrieved from the resulting Matches collection, using Item [0]...[n].
To match parentheses characters ( ), use "\(" or "\)"." - from the
docs on the above page.

I am wondering if anyone could give me an example of using the Item
array to retrieve clustered matches. Like so:

regEx.Pattern = ".*(\d{1,2})Y.*"

I need to match one or two digits followed by 'Y', but I only need to
get the digits. In regEx.Replace() I would use $1 to access the first
cluster (the match inside the innermost set of parens), but I am not
sure how to do it with regEx.Execute().

Thanks.
#2

On Wed, 27 Feb 2008 08:14:46 -0800 (PST), Arshavir Grigorian wrote:
[snip]

Something like this is one method that will return your group 1. Note that
SubMatches.Count is the number of capturing groups (1 here), but the SubMatches
collection itself is zero-based, so group 1 is SubMatches(0).

========================
' Requires a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim ResultString As String
Dim myMatches As MatchCollection, myMatch As Match
Dim myRegExp As RegExp

Set myRegExp = New RegExp
myRegExp.Pattern = ".*(\d{1,2})Y.*"
' SubjectString holds the text being searched
Set myMatches = myRegExp.Execute(SubjectString)

If myMatches.Count = 1 Then
    Set myMatch = myMatches(0)
    If myMatch.SubMatches.Count = 1 Then
        ' Group 1 lives at index 0 of the zero-based SubMatches collection
        ResultString = myMatch.SubMatches(1 - 1)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If
==========================

However, your regex will not do what you want. It will only ever capture a
single digit into group 1: the one immediately preceding the Y.

See if you can figure out why, or look below for the explanation:

Look up the difference between greedy and lazy quantifiers.

The first part of your regex, ".*", says:

.*

Match any single character that is not a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as
needed (greedy) «*»

So, matching as much as it can, it will consume all the digits except for the
single digit preceding the Y. Since \d{1,2} is satisfied by one digit, the
engine gives back only that one character and stops backtracking.


While you could certainly correct this by making the quantifier lazy (e.g.
".*?"), there's really no need for that at all.

You could accomplish your stated goal of capturing one or two digits, which are
followed by a Y, with the simpler regex:

"(\d{1,2})Y"
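
The three variants discussed above can be compared side by side. This is only a minimal sketch: the sample string and the Sub name are made up for illustration, and it uses late binding (CreateObject) so that no library reference is needed.

```vba
Sub CompareQuantifiers()
    ' Late binding: works without a reference to the RegExp library.
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    ' Hypothetical sample text, chosen only for illustration.
    Const SAMPLE As String = "Term 12Y bond"

    ' Greedy leading .* swallows the "1", leaving one digit for group 1.
    re.Pattern = ".*(\d{1,2})Y.*"
    Debug.Print re.Execute(SAMPLE)(0).SubMatches(0)   ' prints 2

    ' Lazy leading .*? stops as early as possible, so group 1 gets both digits.
    re.Pattern = ".*?(\d{1,2})Y.*"
    Debug.Print re.Execute(SAMPLE)(0).SubMatches(0)   ' prints 12

    ' No leading .* at all: the simplest pattern, same result.
    re.Pattern = "(\d{1,2})Y"
    Debug.Print re.Execute(SAMPLE)(0).SubMatches(0)   ' prints 12
End Sub
```

Run it from the VBA editor and read the results in the Immediate window.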

If you might have more than one such construct in a line, then with

myRegExp.Global = True

myMatches.Count will give you the number of times that pattern was present in
the line.
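
A short sketch of that, assuming a made-up sample string with three occurrences and late binding as a convenience (neither is from the original code):

```vba
Sub ListAllMatches()
    Dim re As Object, found As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "(\d{1,2})Y"
    re.Global = True    ' find every occurrence, not just the first

    ' Hypothetical sample text, chosen only for illustration.
    Const SAMPLE As String = "2Y, 10Y and 30Y swaps"

    Set found = re.Execute(SAMPLE)
    Debug.Print found.Count             ' number of occurrences: 3

    ' Each Match carries its own SubMatches collection.
    For Each m In found
        Debug.Print m.SubMatches(0)     ' 2, then 10, then 30
    Next m
End Sub
```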
--ron
#3

Thanks, Ron. That's some elaborate code.

On a related note, the following regex is intended to capture "IN (23,
3454, 354)" or "IN (?)". However, it only captures "IN (23, 3454, 354"
in the first case and "?)" in the second case. Any ideas why?

"IN \(((\w+,?\s*)+)|(\?)\)"

I am accessing the match through

Set myMatch = myMatches(0)
MsgBox ("value: " & myMatch.Value)

By the way, "(IN \(\?\))|(IN \((\w\s*,?\s*)+\))" seems to work fine.
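
One likely explanation, offered as a sketch rather than a definitive answer: alternation (|) has the lowest precedence in a regular expression, so the first pattern is read as "IN \(((\w+,?\s*)+)" OR "(\?)\)". The "IN \(" prefix then belongs only to the first branch and the closing "\)" only to the second, which matches the two results reported above. Wrapping the alternatives in a group keeps the prefix and the closing parenthesis common to both branches. The helper FirstMatch and the pattern named GROUPED are illustrative and not from the thread.

```vba
' Returns the text of the first match, or "" if there is none.
Function FirstMatch(ByVal pat As String, ByVal subject As String) As String
    Dim re As Object, found As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = pat
    Set found = re.Execute(subject)
    If found.Count > 0 Then FirstMatch = found(0).Value
End Function

Sub CompareAlternation()
    ' The pattern as posted: | splits the whole expression in two.
    Const ORIGINAL As String = "IN \(((\w+,?\s*)+)|(\?)\)"
    ' (?: ) is a non-capturing group that limits the scope of |.
    Const GROUPED As String = "IN \((?:(\w+,?\s*)+|\?)\)"

    Debug.Print FirstMatch(ORIGINAL, "IN (23, 3454, 354)")  ' IN (23, 3454, 354
    Debug.Print FirstMatch(ORIGINAL, "IN (?)")              ' ?)
    Debug.Print FirstMatch(GROUPED, "IN (23, 3454, 354)")   ' IN (23, 3454, 354)
    Debug.Print FirstMatch(GROUPED, "IN (?)")               ' IN (?)
End Sub
```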





On Feb 27, 12:04 pm, Ron Rosenfeld wrote:
[snip]

