View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default How to extract specific text from a string of characters

On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help



This UDF will do what you describe:

====================
Option Explicit
Function GetCode(sTxt As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = "[A-Z]\d{5}"
If re.test(sTxt) = True Then
Set mc = re.Execute(sTxt)
GetCode = mc(0).Value
End If
End Function
========================

To enter it, <alt-F11 opens the VBEditor. Ensure your project is highlighted
in the project explorer window, then Insert/Module and paste the code above
into the window that opens.

Then enter the function =GetCode(cell_ref) in some cell.

As written, this will return the first substring that consists of a capitalized
letter followed by at least 5 digits.

One of the issues that you may run into is how to delineate the code. From
your examples, it seems to be the case that the code could be followed
immediately by a capital letter (e.g. in your example 1: ...A52101Battery 6
cell;...)

Could it also be followed immediately by a number?
Will the letter be preceded by anything other than a <space.

The answers to these two questions could help construct a more robust pattern
to use in the above UDF.
--ron