Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
My input data strings have roughly the following format:
"Item1 scissors" "Item2 two notebooks" "item3" "itm4 keyboards and scissors" "item_5 glue,paper,scissors" My strings begin with an "item number" string followed by a description of the item(s). It is possible that an item description could be missing, as seen in "item3" above. (Assume that the "item number" token will always be present) What is a good regular expression that I can use to save the item number in a variable named "$ItemNum" and save the description in a variable named "$Description"?? (If the description string is missing I want the "$Description" variable to be set to the empty string. Thank you. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
On Wed, 30 Jan 2013 03:14:13 -0700, "Robert Crandal" wrote:
My input data strings have roughly the following format: "Item1 scissors" "Item2 two notebooks" "item3" "itm4 keyboards and scissors" "item_5 glue,paper,scissors" My strings begin with an "item number" string followed by a description of the item(s). It is possible that an item description could be missing, as seen in "item3" above. (Assume that the "item number" token will always be present) What is a good regular expression that I can use to save the item number in a variable named "$ItemNum" and save the description in a variable named "$Description"?? (If the description string is missing I want the "$Description" variable to be set to the empty string. Thank you. Your question requires clarification for me to respond. Since this is an Excel programming group, I would normally assume you are writing about vbscript (essentially the same as Javascript) flavor of regular expressions. Subgroup naming is not supported. But you could use subgroups to save the relevant portions of your string, and then assign them to variables within your VBA routine. However, VBA does not support names which begin with "$". If you are in the wrong group, there are flavors of regex that do support variable naming but, at least in the .NET flavor, they do not begin with the "$" as you have shown above. For example, in the .NET flavor, you might have a named capturing group called ItemNum and would use it in a replace string as $[ItemNum]. Also, there is considerable variation in your "item number" tokens, and it is not clear if you want to capture only the number (which presumeably is an integer), or the entire word. Nor is it clear what the extent of variability in your item number tokens might be, or whether we could just identify it more simply as merely (^"\w+\d+) Item item itm item_ |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
Hi Ron.... I only posted this question on the Excel VBA board
because I'm using regular expressions in Excel. I forgot that I should NOT use "$" in my variable names. For now, assume that my variable names will be "sItem" and "sDescription". And yes, there is variation in my "item number", and I want to capture the ENTIRE first token string, whether it be "Item4", "item_4", or even "b6". Our data format looks roughly like this: "item number string" [option string description] So, real examples a Item1 scissors Item2 two notebooks item3 i4 keyboards and scissors item_5 glue,paper,scissors b6 broken keyboard Sorry about the confusion. I hope I made it more clear now. Please let me know if you need further clarification. Thank you. Robert "Ron Rosenfeld" wrote in message ... Your question requires clarification for me to respond. Since this is an Excel programming group, I would normally assume you are writing about vbscript (essentially the same as Javascript) flavor of regular expressions. Subgroup naming is not supported. But you could use subgroups to save the relevant portions of your string, and then assign them to variables within your VBA routine. However, VBA does not support names which begin with "$". If you are in the wrong group, there are flavors of regex that do support variable naming but, at least in the .NET flavor, they do not begin with the "$" as you have shown above. For example, in the .NET flavor, you might have a named capturing group called ItemNum and would use it in a replace string as $[ItemNum]. Also, there is considerable variation in your "item number" tokens, and it is not clear if you want to capture only the number (which presumeably is an integer), or the entire word. Nor is it clear what the extent of variability in your item number tokens might be, or whether we could just identify it more simply as merely (^"\w+\d+) Item item itm item_ |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
On Wed, 30 Jan 2013 12:01:20 -0700, "Robert Crandal" wrote:
Hi Ron.... I only posted this question on the Excel VBA board because I'm using regular expressions in Excel. I forgot that I should NOT use "$" in my variable names. For now, assume that my variable names will be "sItem" and "sDescription". And yes, there is variation in my "item number", and I want to capture the ENTIRE first token string, whether it be "Item4", "item_4", or even "b6". Our data format looks roughly like this: "item number string" [option string description] So, real examples a Item1 scissors Item2 two notebooks item3 i4 keyboards and scissors item_5 glue,paper,scissors b6 broken keyboard Sorry about the confusion. I hope I made it more clear now. Please let me know if you need further clarification. Thank you. Robert Well, the only commonality I can see is the your Item token is always the first word. Then there are some itervening spaces, followed by the rest of the string which comprises the description. That being the case, a regex such as: ^(\S+)(?:\s+(.+))? should be all that you need. Code might look like: ============================ Option Explicit Public sItem As String Public sDescription As String Sub ItemDescr(s As String) Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") With re .Pattern = "^(\S+)(?:\s+(.+))?" .MultiLine = False .Global = True If .test(s) = True Then Set mc = .Execute(s) sItem = mc(0).submatches(0) sDescription = mc(0).submatches(1) End If End With End Sub ======================= You could call it in a variety of ways. Here's one, if your strings are in column A: ================================ Sub test() Dim c As Range For Each c In Range("A1", Cells(Rows.Count, "A").End(xlUp)) ItemDescr (c) Debug.Print sItem, sDescription Next c End Sub ================================== |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
"Ron Rosenfeld" wrote in message
... That being the case, a regex such as: ^(\S+)(?:\s+(.+))? should be all that you need. Hi Ron. Just out of curiousity, what is the meaning of the colon in the above pattern string? |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Another regular expression question
On Sat, 2 Feb 2013 00:29:55 -0700, "Robert Crandal" wrote:
"Ron Rosenfeld" wrote in message .. . That being the case, a regex such as: ^(\S+)(?:\s+(.+))? should be all that you need. Hi Ron. Just out of curiousity, what is the meaning of the colon in the above pattern string? You should have access to a comprehensive definition of the syntax for the relevant flavor if you are going to be using regular expressions. Here is a link to the syntax used in the VBscript flavor: http://msdn.microsoft.com/en-us/libr...=vs.84%29.aspx From that link: ========================== (?:pattern) Matches pattern but does not save the match, that is, the match is not stored for possible later use. This is useful for combining parts of a pattern with the "or" character (|). industr(?:y|ies) is equivalent to industry|industries. =========================== So using this makes the regex more efficient, as it does not have to store the part of the match that includes the spaces between the two elements that we want to match. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
R: Basic regular expression question | Excel Programming | |||
R: Basic regular expression question | Excel Programming | |||
Basic regular expression question | Excel Programming | |||
Can someone help me with this regular expression? | Excel Discussion (Misc queries) | |||
Help with regular expression | Excel Programming |