#1
Posted to microsoft.public.excel.programming
Hi,
I am using Microsoft VBScript RegExp 5.5 within VBA for Excel.

http://msdn2.microsoft.com/en-us/lib...ka(VS.85).aspx

"Matches pattern and remembers the match. The matched substring can be retrieved from the resulting Matches collection, using Item [0]...[n]. To match parentheses characters ( ), use "\(" or "\)"." - from the docs on the above page.

I am wondering if anyone could give me an example of using the Item array to retrieve clustered matches. Like so:

regEx.Pattern = ".*(\d{1,2})Y.*"

I need to match one or two digits followed by 'Y', but I only need to get the digits. In regEx.Replace() I would use $1 to access the first cluster (the match inside the innermost set of parens), but I am not sure how to do it with regEx.Execute().

Thanks.
#2
On Wed, 27 Feb 2008 08:14:46 -0800 (PST), Arshavir Grigorian wrote:

> regEx.Pattern = ".*(\d{1,2})Y.*"
> In regEx.Replace() I would use $1 to access the first cluster [...] but I am not sure how to do it with regEx.Execute().

Something like this is one method that will return your group 1. Note that SubMatches.Count is a one-based count, but the SubMatches collection itself is zero-based, so group 1 is SubMatches(0).

========================
' Early binding: needs a reference to "Microsoft VBScript Regular Expressions 5.5"
Dim ResultString As String
Dim myMatches As MatchCollection, myMatch As Match
Dim myRegExp As RegExp

Set myRegExp = New RegExp
myRegExp.Pattern = ".*(\d{1,2})Y.*"
' SubjectString holds the text to be searched
Set myMatches = myRegExp.Execute(SubjectString)
If myMatches.Count = 1 Then
    Set myMatch = myMatches(0)
    If myMatch.SubMatches.Count = 1 Then
        ' Group 1 lives at index 1 - 1 = 0
        ResultString = myMatch.SubMatches(1 - 1)
    Else
        ResultString = ""
    End If
Else
    ResultString = ""
End If
==========================

However, your regex will not do what you want. It will only ever capture, into group 1, the single digit immediately preceding the Y. See if you can figure out why, or look below for explanation:

Look up the difference between greedy and lazy quantifiers. The first part of your regex, ".*", says:

.*
Match any single character that is not a line break character «.*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

So, matching as much as it can, the ".*" swallows all the digits except the single digit preceding the Y, which it has to give back so that the rest of the pattern can match.

While you could certainly correct this by making the quantifier lazy (e.g. ".*?"), there's really no need for that at all. You could accomplish your stated goal of capturing one or two digits, which are followed by a Y, with the simpler regex:

"(\d{1,2})Y"

If you might have more than one such construct in a line, then with myRegExp.Global = True, myMatches.Count will give you the number of times that pattern was present in the line.

--ron
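To make the greedy-versus-lazy point concrete, here is a minimal sketch that runs the three patterns discussed above against one sample string, then uses Global = True to walk every occurrence. The sample text and the procedure name DemoSubMatches are made up for illustration, and late binding through CreateObject is assumed so that no library reference is required.

```vba
Sub DemoSubMatches()
    ' Hypothetical sample text; not taken from the thread.
    Const SAMPLE As String = "Bond 12Y and note 35Y"

    Dim re As Object          ' VBScript.RegExp, late bound
    Dim found As Object       ' MatchCollection
    Dim m As Object           ' Match
    Dim patterns As Variant, p As Variant

    Set re = CreateObject("VBScript.RegExp")

    ' Greedy prefix, lazy prefix, and no prefix at all.
    patterns = Array(".*(\d{1,2})Y.*", ".*?(\d{1,2})Y.*", "(\d{1,2})Y")

    re.Global = False
    For Each p In patterns
        re.Pattern = p
        Set found = re.Execute(SAMPLE)
        If found.Count > 0 Then
            Set m = found(0)
            ' SubMatches is zero-based, so group 1 is SubMatches(0).
            Debug.Print p & "  ->  group 1 = " & m.SubMatches(0)
        Else
            Debug.Print p & "  ->  no match"
        End If
    Next p

    ' With Global = True, Execute returns one Match per occurrence.
    re.Pattern = "(\d{1,2})Y"
    re.Global = True
    Set found = re.Execute(SAMPLE)
    Debug.Print "occurrences: " & found.Count
    For Each m In found
        Debug.Print "  digits " & m.SubMatches(0) & " at offset " & m.FirstIndex
    Next m
End Sub
```

Run it from the VBA editor and read the Immediate window. With this sample, the greedy pattern should report 5 for group 1 (the last digit before the last Y), while the lazy and prefix-free patterns should report 12.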
#3
Thanks, Ron. That's some elaborate code.
On a related note, the following regex is intended to capture "IN (23, 3454, 354)" or "IN (?)". However, it only captures "IN (23, 3454, 354" in the first case and "?)" in the second case. Any ideas why?

"IN \(((\w+,?\s*)+)|(\?)\)"

I am accessing the match through

Set myMatch = myMatches(0)
MsgBox ("value: " & myMatch.Value)

By the way, "(IN \(\?\))|(IN \((\w\s*,?\s*)+\))" seems to work fine.

On Feb 27, 12:04 pm, Ron Rosenfeld wrote:

> Something like this is one method that will return your group 1. Note the SubMatches count is one-based; but the submatches collection is zero-based.
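One likely reading of the behaviour described in this post: alternation (|) has the lowest precedence in this regex flavour, as in most, so the first pattern splits at the top level into "IN \(((\w+,?\s*)+)" or "(\?)\)". That is consistent with both partial matches reported. The sketch below compares the posted pattern with a variant that keeps the alternation inside the escaped parentheses. The variant pattern and the procedure name DemoAlternation are illustrative assumptions, not something posted in the thread.

```vba
Sub DemoAlternation()
    Dim re As Object          ' VBScript.RegExp, late bound
    Dim found As Object       ' MatchCollection
    Dim patterns As Variant, samples As Variant
    Dim p As Variant, s As Variant

    Set re = CreateObject("VBScript.RegExp")

    ' First entry: the pattern as posted.
    ' Second entry: hypothetical variant where "|" only chooses between
    ' the value list and "?", so "IN \(" and "\)" apply to both branches.
    patterns = Array("IN \(((\w+,?\s*)+)|(\?)\)", "IN \(((\w+,?\s*)+|\?)\)")
    samples = Array("IN (23, 3454, 354)", "IN (?)")

    For Each p In patterns
        re.Pattern = p
        For Each s In samples
            Set found = re.Execute(s)
            If found.Count > 0 Then
                Debug.Print p & "   on   " & s & "   ->   " & found(0).Value
            Else
                Debug.Print p & "   on   " & s & "   ->   no match"
            End If
        Next s
    Next p
End Sub
```

With the posted pattern, the Immediate window should show the same truncated values described above; with the variant, both sample strings should be matched in full. The second pattern quoted in the post works for the same reason: each branch of its alternation spells out the whole "IN (...)" form.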