Posted to microsoft.public.excel.programming
Ron Rosenfeld[_2_]
Another regular expression question

On Wed, 30 Jan 2013 03:14:13 -0700, "Robert Crandal" wrote:

My input data strings have roughly the following format:

"Item1 scissors"
"Item2 two notebooks"
"item3"
"itm4 keyboards and scissors"
"item_5 glue,paper,scissors"

My strings begin with an "item number" string followed by a
description of the item(s). It is possible that an item description
could be missing, as seen in "item3" above. (Assume that the
"item number" token will always be present)

What is a good regular expression that I can use to save the
item number in a variable named "$ItemNum" and save the
description in a variable named "$Description"? (If the
description string is missing I want the "$Description" variable
to be set to the empty string.)

Thank you.



Your question requires clarification for me to respond.

Since this is an Excel programming group, I would normally assume you are writing about the VBScript flavor of regular expressions (essentially the same as the JavaScript flavor). Named capturing groups are not supported in that flavor. But you could use numbered capturing groups (submatches) to save the relevant portions of your string, and then assign them to variables within your VBA routine. Note, however, that VBA does not allow variable names that begin with "$".
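To illustrate the numbered-group approach, here is a minimal sketch. It is written in Python only so that it can be run and checked easily; it uses nothing but numbered groups, which is all the VBScript flavor offers. The pattern, the function name split_item, and the assumption that the surrounding quotes are not part of the data are mine, not something you specified.

```python
import re

# Assumed pattern (not confirmed by the original question):
#   group 1 = the "item number" token: word characters ending in digits
#   group 2 = the description, which may be empty
PATTERN = re.compile(r'^(\w+\d+)\s*(.*)$')

def split_item(text):
    """Return (item_num, description), or None if the string does not match."""
    m = PATTERN.match(text)
    if m is None:
        return None
    # Numbered groups are assigned to ordinary variables after the match.
    item_num = m.group(1)
    description = m.group(2)  # empty string when no description follows
    return item_num, description

if __name__ == "__main__":
    for s in ["Item1 scissors", "item3", "item_5 glue,paper,scissors"]:
        print(split_item(s))
```

In a VBA routine the same idea would read the two submatches of the match object and assign them to variables named, say, ItemNum and Description (without the "$").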

If you are in the wrong group, there are flavors of regex that do support named capturing groups but, at least in the .NET flavor, the names do not begin with "$" as you have shown above. For example, in the .NET flavor, you might have a named capturing group called ItemNum, written (?<ItemNum>...), and would refer to it in a replace string as ${ItemNum}.

Also, there is considerable variation in your "item number" tokens, and it is not clear whether you want to capture only the number (which presumably is an integer) or the entire word. Nor is it clear how much further the item number tokens might vary, or whether we could identify the token more simply as merely (^"\w+\d+), given the prefixes seen in your examples:

Item
item
itm
item_
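To make the two readings of the "item number" token concrete, here is a small sketch that returns the prefix, the integer, and the whole token separately, so either can be kept. It is in Python for easy testing but uses only numbered groups; the pattern and the name split_token are my assumptions, and the leading quote is assumed not to be part of the data.

```python
import re

# Assumed pattern: a lazily matched prefix of word characters (may include "_"),
# followed by the digits that make up the number.
TOKEN = re.compile(r'^(\w*?)(\d+)')

def split_token(text):
    """Return (prefix, number, whole_token), or None if no token is found."""
    m = TOKEN.match(text)
    if m is None:
        return None
    prefix = m.group(1)        # e.g. "Item", "itm", "item_"
    number = int(m.group(2))   # the integer part only
    whole = m.group(0)         # the entire token, e.g. "item_5"
    return prefix, number, whole

if __name__ == "__main__":
    for s in ["Item1 scissors", "itm4 keyboards and scissors", "item_5 glue,paper,scissors"]:
        print(split_token(s))
```

Which of the three values belongs in your item number variable depends on the answer to the question above.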