#1
Posted to microsoft.public.excel.programming
Basic regular expression question

I'm still a newbie at all this regular expression stuff, so forgive
this newb question....

My input data strings have the following format:

"Item1 scissors"
"Item2 notebooks"
"I3 pens"
"itm4 keyboards"
......

So, each line is basically formatted like this:

[string of characters] [several whitespace(s)] [string of characters]

Using regular expressions, how can I store each pair of items in
separate variables??

For example, if I read in the first line above,
I would like my variable sNum to store the "Item1" string, and a
variable named sObj would store "scissors". I guess I'm really
trying to parse each pair of items and store them in variables
using regular expressions, but I don't fully understand how to
create my own regular expression pattern strings yet.

Thanks!


#2

On Sat, 16 Apr 2011 15:09:58 -0700, "Robert Crandal" wrote:

> I'm still a newbie at all this regular expression stuff, so forgive
> this newb question....
>
> My input data strings have the following format:
>
> "Item1 scissors"
> "Item2 notebooks"
> "I3 pens"
> "itm4 keyboards"
> .....
>
> So, each line is basically formatted like this:
>
> [string of characters] [several whitespace(s)] [string of characters]
>
> Using regular expressions, how can I store each pair of items in
> separate variables??
>
> For example, if I read in the first line above,
> I would like my variable sNum to store the "Item1" string, and a
> variable named sObj would store "scissors". I guess I'm really
> trying to parse each pair of items and store them in variables
> using regular expressions, but I don't fully understand how to
> create my own regular expression pattern strings yet.
>
> Thanks!


What you show is two words separated by space(s).

Assuming that the words contain only letters, digits and possibly an underscore, and that there are only two words in each line, the regex is fairly simple:


^(\w+)\s+(\w+)

which means:

Assert position at the beginning of the string «^»

Match the regular expression below and capture its match into backreference number 1 «(\w+)»

Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

Match the regular expression below and capture its match into backreference number 2 «(\w+)»

Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

A sample VBA macro that stores each item number in Results(0, i) and each object in Results(1, i) of a two-dimensional array might look like:

=================================
Option Explicit

Sub foo()
    Dim re As Object, mc As Object
    ' Anchored with ^ so the code uses the same pattern as explained above
    Const sPat As String = "^(\w+)\s+(\w+)"
    Dim InputData(0 To 3) As String
    Dim i As Long
    Dim Results() As String

    InputData(0) = "Item1 scissors"
    InputData(1) = "Item2 notebooks"
    InputData(2) = "I3 pens"
    InputData(3) = "itm4 keyboards"

    ' Late binding, so no library reference has to be set
    Set re = CreateObject("vbscript.regexp")
    re.Pattern = sPat
    re.Global = True

    For i = 0 To UBound(InputData)
        If re.Test(InputData(i)) Then
            Set mc = re.Execute(InputData(i))
            ' ReDim Preserve can only resize the last dimension,
            ' so each input line gets its own column
            ReDim Preserve Results(0 To 1, 0 To i)
            Results(0, i) = mc(0).SubMatches(0)   ' item number
            Results(1, i) = mc(0).SubMatches(1)   ' object
        End If
    Next i

End Sub
==========================
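
For a single input line, the two captures can also go straight into the sNum and sObj variables named in the question. This is only a minimal sketch: the procedure name ParseOneLine and the hard-coded input string are made up for illustration.

```vba
Option Explicit

' Hypothetical helper, not part of the macro above
Sub ParseOneLine()
    Dim re As Object, mc As Object
    Dim sNum As String, sObj As String

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "^(\w+)\s+(\w+)"

    Set mc = re.Execute("Item1 scissors")
    If mc.Count > 0 Then
        sNum = mc(0).SubMatches(0)   ' first capture group
        sObj = mc(0).SubMatches(1)   ' second capture group
    End If

    Debug.Print sNum, sObj
End Sub
```

SubMatches is zero-based, so capture group 1 of the pattern is SubMatches(0) and group 2 is SubMatches(1).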

Hope this helps.
#3

Hello Ron! I just wanted to say thank you so much for your excellent
help again. That code is working great!

I have a new question now. I just realized that the third element can
actually contain multiple word elements. So, my data might actually
look like this:

"Item1 scissors"
"Item2 red notebooks"
"Item3 number #2 pencils"

So, the data format really is:

[single string of characters] [whitespace(s)] [any string of characters
and optional whitespace(s)]

So....

I plan to basically reuse the code you gave me previously, but I need
to modify the regular expression pattern so that the variable
mc(0).submatches(1) would get assigned strings like "scissors", or
"red notebooks", or "number #2 pencils"

How should I change the pattern string?

Thankx!



"Ron Rosenfeld" wrote in message
...
> On Sat, 16 Apr 2011 15:09:58 -0700, "Robert Crandal" wrote:
>
> What you show is two words separated by space(s).


#4

> I have a new question now. I just realized that the third element
> can actually contain multiple word elements. So, my data might
> actually look like this:
>
> "Item1 scissors"
> "Item2 red notebooks"
> "Item3 number #2 pencils"
>
> So, the data format really is:
>
> [single string of characters] [whitespace(s)] [any string of
> characters and optional whitespace(s)]
>
> So....
>
> I plan to basically reuse the code you gave me previously, but I
> need to modify the regular expression pattern so that the variable
> mc(0).submatches(1) would get assigned strings like "scissors", or
> "red notebooks", or "number #2 pencils"
>
> How should I change the pattern string?


I'm not sure if this will be helpful to you or not, as I think you are
attempting to learn how to program with Regular Expressions; however,
assuming those "whitespaces" you mentioned are simply normal spaces, you
can do what you have asked without using Regular Expressions... straight
VB is enough. Here is Ron's macro revised to work without Regular
Expressions and modified to handle the multi-word data you just posted
about...

Sub FooToo()
    Dim i As Long, InputData(0 To 3) As String
    Dim Parts() As String, Results() As String

    InputData(0) = "Item1 scissors"
    InputData(1) = "Item2 notebooks"
    InputData(2) = "Item2 red notebooks"
    InputData(3) = "Item3 number #2 pencils"

    For i = 0 To UBound(InputData)
        If InStr(InputData(i), " ") > 0 Then
            ' Trim collapses runs of spaces to one; a Split limit of 2
            ' keeps everything after the first space in one piece
            Parts = Split(WorksheetFunction.Trim(InputData(i)), " ", 2)
            ReDim Preserve Results(0 To 1, 0 To i)
            Results(0, i) = Parts(0)
            Results(1, i) = Parts(1)
        End If
    Next i
End Sub
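
To see what the Trim and Split combination does to a single string, a small sketch like the following can be run from the VBA editor inside Excel (WorksheetFunction is only available there). The procedure name SplitLimitDemo and the extra spaces in the sample string are assumptions added for illustration.

```vba
Option Explicit

' Hypothetical demo, not part of the macro above
Sub SplitLimitDemo()
    Dim Parts() As String

    ' WorksheetFunction.Trim removes leading/trailing spaces and
    ' collapses every run of spaces to a single space
    Parts = Split(WorksheetFunction.Trim("Item3   number #2   pencils"), " ", 2)

    Debug.Print Parts(0)   ' Item3
    Debug.Print Parts(1)   ' number #2 pencils
End Sub
```

Note that the collapsing applies to the whole string, so any runs of spaces inside the object text are reduced to single spaces as well. If that spacing matters, this approach changes the data.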

Rick Rothstein (MVP - Excel)

#5

On Sat, 16 Apr 2011 22:37:36 -0700, "Robert Crandal" wrote:

> I plan to basically reuse the code you gave me previously, but I need
> to modify the regular expression pattern so that the variable
> mc(0).submatches(1) would get assigned strings like "scissors", or
> "red notebooks", or "number #2 pencils"
>
> How should I change the pattern string?
>
> Thankx!


If the "object" will be all on the same line:

"^(\w+)\s+(.+)"

However, because of the peculiarities of the Microsoft implementation in VBA (the dot never matches a line feed, and there is no option to change that), if the "object" might span a second line, then you should use:

"^(\w+)\s+([\s\S]+)"
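
The difference between the two patterns can be checked with a short sketch along these lines. The procedure name PatternDemo and the test string with an embedded vbLf are assumptions made up for the demonstration.

```vba
Option Explicit

' Hypothetical demo comparing the two patterns above
Sub PatternDemo()
    Dim re As Object, mc As Object
    Dim sTwoLines As String

    Set re = CreateObject("VBScript.RegExp")
    sTwoLines = "Item3 number #2" & vbLf & "pencils"

    ' The dot stops at the line feed, so only the first line is captured
    re.Pattern = "^(\w+)\s+(.+)"
    Set mc = re.Execute(sTwoLines)
    If mc.Count > 0 Then Debug.Print "[" & mc(0).SubMatches(1) & "]"

    ' [\s\S] matches any character, line feeds included
    re.Pattern = "^(\w+)\s+([\s\S]+)"
    Set mc = re.Execute(sTwoLines)
    If mc.Count > 0 Then Debug.Print "[" & mc(0).SubMatches(1) & "]"
End Sub
```

The first Debug.Print should show only "number #2" between the brackets, while the second shows the text of both lines. Both patterns are greedy to the end, so any trailing spaces in the input end up in the second capture.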

As an aid to writing and testing regular expressions, I would suggest a program titled RegexBuddy (www.regexbuddy.com)

And, as Rick is so fond of pointing out, you can do most anything using built-in VBA methods without using Regular Expressions, and they will often run more quickly if that is an issue. However, once you become fluent in Regular Expressions, it takes much less time to develop complex string manipulations using them than using VBA.

Of course, if speed is paramount, I suppose we should be writing in machine language <g>.