ExcelBanter


Robert Crandal[_2_]

Impossible with regular expressions?
 
Hi everyone. So I think I finally have a better understanding of
how to use regular expressions in VBA, but I have a more
difficult question.

I am working with strings that basically contain random sets
of character strings (or "tokens") that are separated by any
number of whitespace characters. Here are some example
strings that I might encounter:

* "The age of the dog is 100 years!!!"
* "aa bb cc dd ee 01 10 111 ooo"
* " Here is a string that contains a total of 13 tokens... got it?"

I am trying to develop a regex pattern string that will let me enumerate
or collect each of the tokens in the string. I would like to use the
Submatches() function to retrieve any "token" in the string by index.

So, using the third string above as an example, I would want
the Submatches() function to return the following:

.Submatches(0) = "Here"
.Submatches(1) = "is"
.Submatches(2) = "a"
.Submatches(3) = "string"
.Submatches(4) = "that"
etc... etc....
.Submatches(12) = "it?"

I hope my question makes sense. I just need help with the
pattern string. So far, the only thing I could think of is the
following:

"(\s+(\S+)\s*)+" ?????

Got any ideas?

Robert Crandal





isabelle

Impossible with regular expressions?
 
hi,

there's an example here:
http://code.dunae.ca/name_case/
I hope this link helps you.


--
isabelle






Ron Rosenfeld[_2_]

Impossible with regular expressions?
 
In reply to "Robert Crandal", Sat, 23 Apr 2011 16:03:14 -0700:

I don't understand why you want to use submatches.

The simplest is usually the most efficient, and I would think the simplest regex, to collect strings separated by spaces, would be "\S+". Your collection, instead of being submatches, would be matches.

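e.g.:
===================
Option Explicit

Function foo(s As String) As Object
    Dim re As Object, mc As Object
    Const sPat As String = "\S+"    ' one or more non-whitespace characters

    ' Late-bound RegExp object, so no library reference is required
    Set re = CreateObject("vbscript.regexp")
    re.Global = True                ' find every token, not just the first
    re.Pattern = sPat
    Set mc = re.Execute(s)          ' each match is one token

    Set foo = mc                    ' hand the match collection back to the caller
End Function
===================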
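
On the submatches point, a pattern has a fixed number of capturing groups, so the SubMatches collection has a fixed size no matter how often a group repeats. The sketch below tries the original poster's pattern and reports that size. The function name CountSubMatches is hypothetical and is not part of the thread.

```vba
Option Explicit

' Hypothetical helper (not from the thread): returns the number of
' submatches produced by the pattern "(\s+(\S+)\s*)+" for a string.
Function CountSubMatches(s As String) As Long
    Dim re As Object, mc As Object

    Set re = CreateObject("vbscript.regexp")
    re.Global = False
    re.Pattern = "(\s+(\S+)\s*)+"
    Set mc = re.Execute(s)

    If mc.Count > 0 Then
        ' One entry per pair of parentheses in the pattern,
        ' regardless of how many times the outer group repeated
        CountSubMatches = mc(0).SubMatches.Count
    End If
End Function
```

With two pairs of parentheses the count should be 2 however many tokens the string holds, which is why one match per token is the better fit for collecting them.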

Ron Rosenfeld[_2_]

Impossible with regular expressions?
 
In reply to Ron Rosenfeld, Sat, 23 Apr 2011 23:10:03 -0400:

And to do something with it:

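===================
Option Explicit

Function foo(s As String)
    Dim re As Object, mc As Object
    Dim i As Long
    ' The lookahead is always satisfied after a greedy \S+,
    ' so this finds the same tokens as plain "\S+"
    Const sPat As String = "(\S+)(?=\s+|$)"

    Set re = CreateObject("vbscript.regexp")
    re.Global = True                ' find every token, not just the first
    re.Pattern = sPat
    Set mc = re.Execute(s)

    ' Print the zero-based index and text of each token
    For i = 0 To mc.Count - 1
        Debug.Print i, mc(i)
    Next i

End Function
========================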


Or you could do something like:

Function foo(s As String, Optional Index As Long = 0)
..
..
..
..
..
foo = mc(Index)
End Function
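
Filled in, the outline above might look like the following sketch. The name TokenAt and the choice to return an empty string for an out-of-range index are assumptions, not something specified in the thread.

```vba
Option Explicit

' Hypothetical helper: returns the token at the zero-based position Index.
' Returning an empty string for an out-of-range Index is an assumption.
Function TokenAt(s As String, Optional Index As Long = 0) As String
    Dim re As Object, mc As Object

    Set re = CreateObject("vbscript.regexp")
    re.Global = True                ' collect every token
    re.Pattern = "\S+"
    Set mc = re.Execute(s)

    If Index >= 0 And Index < mc.Count Then
        TokenAt = mc(Index).Value
    Else
        TokenAt = vbNullString
    End If
End Function
```

Called as TokenAt(" Here is a string that contains a total of 13 tokens... got it?", 12), this should give the last token, "it?", which is what the original post wanted from Submatches(12).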

Ron Rosenfeld[_2_]

Impossible with regular expressions?
 
In reply to "Robert Crandal", Sat, 23 Apr 2011 16:03:14 -0700:

One other note: Although I know you are trying to become proficient in regular expressions, this same task can be accomplished fairly simply in VBA. For example:

==================
Function SplitString(s As String) As Variant
SplitString = Split(WorksheetFunction.Trim(s))
End Function
======================

SplitString will be a zero-based array containing your collection.
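
A short usage sketch, assuming the code runs inside Excel (WorksheetFunction is only available there). The routine name DemoSplitString is hypothetical. One caveat: WorksheetFunction.Trim collapses only space characters and Split defaults to a space delimiter, so tokens separated by tabs or line breaks would not be split this way.

```vba
Option Explicit

Function SplitString(s As String) As Variant
    ' Trim removes leading/trailing spaces and collapses runs of spaces,
    ' then Split breaks the result on single spaces
    SplitString = Split(WorksheetFunction.Trim(s))
End Function

' Hypothetical demo routine: lists each token with its zero-based index
Sub DemoSplitString()
    Dim tokens As Variant
    Dim i As Long

    tokens = SplitString(" Here is a string that contains a total of 13 tokens... got it?")
    For i = LBound(tokens) To UBound(tokens)
        Debug.Print i, tokens(i)
    Next i
End Sub
```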

