ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Worksheet Functions (https://www.excelbanter.com/excel-worksheet-functions/)
-   -   How to extract specific text from a string of characters (https://www.excelbanter.com/excel-worksheet-functions/221455-how-extract-specific-text-string-characters.html)

rushdhih

How to extract specific text from a string of characters
 
In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help




Shane Devenshire[_2_]

How to extract specific text from a string of characters
 
Why don't you show us a few examples, technically this could be

A52101
A52101A52101
zzzA52101
dfsdsafdsa A52101 amb
asA52101bcdA52101
B123456787vA52101dsfdsf

How can you tell all these apart? How can you tell which 6 digit code is
really a code as in the last example?

--
If this helps, please click the Yes button

Cheers,
Shane Devenshire


"rushdhih" wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help




rushdhih

How to extract specific text from a string of characters
 
here are some actual data from which the code needs to be extracted. As you
can see the code appears sometimes in the begining preceded by a semicolon,
in the middle or towards the end.

I have marked as 1,2,3 & 4 for clarity to show the possibilities

1)AP Accruals; A52101Battery 6 cell; A52101 Battery 6 cell; LKOC_RH_Toshiba
Re-Inv#3004461
2); ; A52101 Internet usage charges on Dialog USB Modem for December 2008;
3)AP Accruals; A52101 Internet usage charges; A52101 Internet usage charges
on Dialog USB Modem for December 2008; LKOC_RH_Dialog Internet-Dec 08
4)2008 PO ACCRUAL REVERSAL; ; A52101 PCB;

Thank you for any help you can give.



"Shane Devenshire" wrote:

Why don't you show us a few examples, technically this could be

A52101
A52101A52101
zzzA52101
dfsdsafdsa A52101 amb
asA52101bcdA52101
B123456787vA52101dsfdsf

How can you tell all these apart? How can you tell which 6 digit code is
really a code as in the last example?

--
If this helps, please click the Yes button

Cheers,
Shane Devenshire


"rushdhih" wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help




Gary''s Student

How to extract specific text from a string of characters
 
Consider the following User Defined Function:

Function lookit(r As Range) As String
Dim s As String
s = r.Value
Dim outputt(5) As String
iPlace = 0
For i = 1 To Len(s)
ch = Mid(s, i, 1)
If iPlace = 0 Then
If caps(ch) Then
outputt(iPlace) = ch
iPlace = iPlace + 1
End If
Else
If digit(ch) Then
outputt(iPlace) = ch
iPlace = iPlace + 1
If iPlace = 6 Then Exit For
Else
iPlace = 0
End If
End If
Next
lookit = Join(outputt, "")
End Function


Function digit(v As Variant) As Boolean
v2 = Array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0")
digit = False
For i = LBound(v2) To UBound(v2)
If v = v2(i) Then
digit = True
Exit Function
End If
Next
End Function


Function caps(v As Variant) As Boolean
Strn = "A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y ,Z"
v2 = Split(Strn, ",")
caps = False
For i = LBound(v2) To UBound(v2)
If v = v2(i) Then
caps = True
Exit Function
End If
Next
End Function

So, for example, if A1 contained:
87326428A12345lksfasjk
=lookit(A1) would display:

A12345

--
Gary''s Student - gsnu200834


"rushdhih" wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help




Ron Rosenfeld

How to extract specific text from a string of characters
 
On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help



This UDF will do what you describe:

====================
Option Explicit
Function GetCode(sTxt As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = "[A-Z]\d{5}"
If re.test(sTxt) = True Then
Set mc = re.Execute(sTxt)
GetCode = mc(0).Value
End If
End Function
========================

To enter it, <alt-F11 opens the VBEditor. Ensure your project is highlighted
in the project explorer window, then Insert/Module and paste the code above
into the window that opens.

Then enter the function =GetCode(cell_ref) in some cell.

As written, this will return the first substring that consists of a capitalized
letter followed by at least 5 digits.

One of the issues that you may run into is how to delineate the code. From
your examples, it seems to be the case that the code could be followed
immediately by a capital letter (e.g. in your example 1: ...A52101Battery 6
cell;...)

Could it also be followed immediately by a number?
Will the letter be preceded by anything other than a <space.

The answers to these two questions could help construct a more robust pattern
to use in the above UDF.
--ron

Ron Rosenfeld

How to extract specific text from a string of characters
 
On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help



Here's another UDF that does not use Regular Expressions:

========================
Option Explicit
Option Compare Binary
Function GetCode(sTxt As String) As String
Const sPattern As String = "[A-Z]#####"
Dim i As Long
For i = 1 To Len(sTxt) - 6
If Mid(sTxt, i, 6) Like sPattern Then
GetCode = Mid(sTxt, i, 6)
Exit Function
End If
Next i
End Function
=================================
--ron

rushdhih

How to extract specific text from a string of characters
 
Thank you so much. It worked perfectly for me.

"Ron Rosenfeld" wrote:

On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote:

In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we
have codified client budget data. Every transaction description field entered
into the ERP system includes a 6 digit code (Eg:A52101). The problem I am
encountering in the downloaded data is there are system generated text and
other spurious text that creeps in during the conversion process. The code
appears sometime in the left, right or center of the transaction description
together with other text.

I need a formula that will check through the transaction description and
extract only the first occurence of the code.

The pattern of the code - Alpha + 5 Numeric character (no spaces)


Thank you in advance for your help



Here's another UDF that does not use Regular Expressions:

========================
Option Explicit
Option Compare Binary
Function GetCode(sTxt As String) As String
Const sPattern As String = "[A-Z]#####"
Dim i As Long
For i = 1 To Len(sTxt) - 6
If Mid(sTxt, i, 6) Like sPattern Then
GetCode = Mid(sTxt, i, 6)
Exit Function
End If
Next i
End Function
=================================
--ron


Ron Rosenfeld

How to extract specific text from a string of characters
 
On Wed, 18 Feb 2009 20:39:01 -0800, rushdhih
wrote:

Thank you so much. It worked perfectly for me.


You're welcome. Thanks for the feedback.
--ron


All times are GMT +1. The time now is 07:26 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com