Posted to microsoft.public.excel.programming
Ron Rosenfeld
Basic regular expression question

On Sat, 16 Apr 2011 15:09:58 -0700, "Robert Crandal" wrote:

I'm still a newbie at all this regular expression stuff, so forgive
this newb question....

My input data strings have the following format:

"Item1 scissors"
"Item2 notebooks"
"I3 pens"
"itm4 keyboards"
.....

So, each line is basically formatted like this:

[string of characters] [several whitespace(s)] [string of characters]

Using regular expressions, how can I store each pair of items in
separate variables??

For example, if I read in the first line above,
I would like my variable sNum to store the "Item1" string, and a
variable named sObj would store "scissors". I guess I'm really
trying to parse each pair of items and store them in variables
using regular expressions, but I don't fully understand how to
create my own regular expression pattern strings yet.

Thanks!


What you show is two words separated by space(s).

Assuming that the words contain only letters, digits and possibly an underscore, and that there are only two words in each line, the regex is fairly simple:


^(\w+)\s+(\w+)

which means:

Assert position at the beginning of the string «^»

Match the regular expression below and capture its match into backreference number 1 «(\w+)»

Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Match the regular expression below and capture its match into backreference number 2 «(\w+)»

Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
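
A quick way to see what this pattern accepts and rejects is to run it against a few strings in the Immediate window. This is only a sketch: the sub name CheckPattern and the last three test strings are made up for illustration and are not part of your data.

```vba
Sub CheckPattern()
    Dim re As Object
    ' Late-bound RegExp object, so no library reference is required
    Set re = CreateObject("vbscript.regexp")
    re.Pattern = "^(\w+)\s+(\w+)"

    Debug.Print re.Test("Item1 scissors")     ' True
    Debug.Print re.Test("I3     pens")        ' True: \s+ accepts several spaces
    Debug.Print re.Test("Item1")              ' False: no whitespace, no second word
    Debug.Print re.Test("Item-1 scissors")    ' False: a hyphen is not a word character
End Sub
```

If your real data can contain characters such as hyphens, the \w assumption stated above no longer holds and the pattern would need to be adjusted.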

A sample VBA macro that captures the item number into the first row of a two-dimensional array, and the object into the second row, might look like:

=================================
Option Explicit
Sub foo()
    Dim re As Object, mc As Object
    ' Same pattern as explained above, including the start-of-string anchor
    Const sPat As String = "^(\w+)\s+(\w+)"
    Dim InputData(0 To 3) As String
    Dim i As Long
    Dim Results() As String

    InputData(0) = "Item1 scissors"
    InputData(1) = "Item2 notebooks"
    InputData(2) = "I3 pens"
    InputData(3) = "itm4 keyboards"

    ' Late-bound RegExp object, so no library reference is required
    Set re = CreateObject("vbscript.regexp")
    re.Pattern = sPat
    re.Global = True

    For i = 0 To UBound(InputData)
        If re.Test(InputData(i)) = True Then
            Set mc = re.Execute(InputData(i))
            ' ReDim Preserve can only resize the last dimension,
            ' so each input line gets its own column
            ReDim Preserve Results(0 To 1, 0 To i)
            Results(0, i) = mc(0).SubMatches(0)   ' item number
            Results(1, i) = mc(0).SubMatches(1)   ' object
        End If
    Next i

End Sub
==========================
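
If you would rather have the two pieces in plain variables, as in your sNum / sObj example, the same approach might be reduced to something like the following. It is a minimal sketch for a single line: the sub name SplitOneLine and the constant sLine are hypothetical, and it assumes the line matches the pattern described above.

```vba
Sub SplitOneLine()
    Dim re As Object, mc As Object
    Dim sNum As String, sObj As String
    Const sLine As String = "Item1 scissors"

    Set re = CreateObject("vbscript.regexp")
    re.Pattern = "^(\w+)\s+(\w+)"

    If re.Test(sLine) Then
        Set mc = re.Execute(sLine)
        sNum = mc(0).SubMatches(0)    ' first capture group: "Item1"
        sObj = mc(0).SubMatches(1)    ' second capture group: "scissors"
        Debug.Print sNum, sObj
    End If
End Sub
```

SubMatches is zero-based, so capture group 1 in the pattern is SubMatches(0) and capture group 2 is SubMatches(1).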

Hope this helps.