Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default Use 'like' or regex to match number-letter pattern substrings

There has to be an easier way that the one I'm currently working on...

I'm writing a UDF to help a colleague parse some key data out of some
inconsistently formatted raw data strings. One seach will be for machine part
numbers in the raw data string. All part numbers are in the format
Alpha-Alpha-Number-Number-Number, as shown in the following string:
"I pulled the engine apart and found that part MX452 was ok, but MV119 was
worn and needed to be replaced"

I've made the UDF with an input string parameter that must only contain the
characters "#" and "A", for example, "##AAA" for the above part numbers.
After performing it's VBA magic, the UDF should then return "MX452, MV119"

The actual pattern may differ for different searches; for example, searches
might be for an employee ID (#AAA), a tool ID (#######), a test number
(AAA#), or other (sub)strings. There may be a few where there will be a
non-alphanumeric character in a specific position (###-#A#).

One option is Regex, which I've used for fixed search strings, but my brain
hurts thinking about how to code Regex to dynamically deal with a user-input
string.

The other option (and what I've started) is to look at every substring in
the source string, and then compare each substring to the pattern to see if
it matches. I don't think this is very efficient, and given that there will
be thousands of records, it will also probably be painfully slow.

Example, based on the above string: Comparison string is 5 characters, so
check each 5-character substring ("I pul", " pull", "pulle", "ulled", lled ",
etc...)

I've looked at the "like" statement, and that seems to be a good lead. I can
use the user-input string to create something like:
[!0-9][!0-9][!A-Z][!A-Z][!A-Z]

And maybe use that to match each substring ("I pul", " pull", "pulle",
"ulled", lled ", etc...) but that still seems inefficient. However, that
seems to be the only way to know where the match is (to actually return it),
and to see if there is more than one match in the raw data string.

Are there any easier approaches to finding (and pulling) every substring
that matches a user-designated pattern?

Thanks,
Keith
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,045
Default Use 'like' or regex to match number-letter pattern substrings

On Wed, 2 Jun 2010 12:35:07 -0700, ker_01
wrote:

There has to be an easier way that the one I'm currently working on...

I'm writing a UDF to help a colleague parse some key data out of some
inconsistently formatted raw data strings. One seach will be for machine part
numbers in the raw data string. All part numbers are in the format
Alpha-Alpha-Number-Number-Number, as shown in the following string:
"I pulled the engine apart and found that part MX452 was ok, but MV119 was
worn and needed to be replaced"

I've made the UDF with an input string parameter that must only contain the
characters "#" and "A", for example, "##AAA" for the above part numbers.
After performing it's VBA magic, the UDF should then return "MX452, MV119"

The actual pattern may differ for different searches; for example, searches
might be for an employee ID (#AAA), a tool ID (#######), a test number
(AAA#), or other (sub)strings. There may be a few where there will be a
non-alphanumeric character in a specific position (###-#A#).

One option is Regex, which I've used for fixed search strings, but my brain
hurts thinking about how to code Regex to dynamically deal with a user-input
string.

The other option (and what I've started) is to look at every substring in
the source string, and then compare each substring to the pattern to see if
it matches. I don't think this is very efficient, and given that there will
be thousands of records, it will also probably be painfully slow.

Example, based on the above string: Comparison string is 5 characters, so
check each 5-character substring ("I pul", " pull", "pulle", "ulled", lled ",
etc...)

I've looked at the "like" statement, and that seems to be a good lead. I can
use the user-input string to create something like:
[!0-9][!0-9][!A-Z][!A-Z][!A-Z]

And maybe use that to match each substring ("I pul", " pull", "pulle",
"ulled", lled ", etc...) but that still seems inefficient. However, that
seems to be the only way to know where the match is (to actually return it),
and to see if there is more than one match in the raw data string.

Are there any easier approaches to finding (and pulling) every substring
that matches a user-designated pattern?

Thanks,
Keith


Just use a regex to match your pattern. Use an argument in your UDF
to determine which regex to use.

For example:

=================================
Option Explicit
Function ExtrStr(s As String, ssType As Long) As Variant
Dim re As Object, mc As Object, m As Object
Dim sPat As String
Dim sTemp() As String
Dim i As Long

Set re = CreateObject("vbscript.regexp")
Select Case ssType
Case Is = 1
sPat = "\b[A-Z]{2}\d{3}\b"
Case Is = 2
sPat = "\b[A-Z]\d{3}]\bb"
Case Is = 3
sPat = "\b\d{3}-[A-Z]\d[A-Z]\b"
Case Else
ExtrStr = CVErr(xlErrNum)
Exit Function
End Select

re.Pattern = sPat
re.Global = True
re.ignorecase = True

If re.test(s) = False Then
ExtrStr = s
Exit Function
End If

Set mc = re.Execute(s)
ReDim sTemp(0 To mc.Count - 1)
For i = 0 To mc.Count - 1
sTemp(i) = mc(i)
Next i

ExtrStr = Join(sTemp, ", ")

End Function
==========================================
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Regex Pattern Geoff K Excel Programming 5 June 12th 09 12:49 AM
Regex Pattern to extract Sheet Name from .Address(external:=True) ExcelMonkey Excel Programming 9 June 4th 08 11:17 AM
Another Problem with Regex Pattern ExcelMonkey Excel Programming 2 May 31st 08 02:53 PM
Help with a regex pattern please [email protected] Excel Programming 3 March 22nd 08 12:14 AM
Help with a Regex Pattern [email protected] Excel Programming 11 April 30th 07 01:49 AM


All times are GMT +1. The time now is 09:55 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"