Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
In order to meet client reporting requirements, we download data from our ERP
system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
Why don't you show us a few examples, technically this could be
A52101 A52101A52101 zzzA52101 dfsdsafdsa A52101 amb asA52101bcdA52101 B123456787vA52101dsfdsf How can you tell all these apart? How can you tell which 6 digit code is really a code as in the last example? -- If this helps, please click the Yes button Cheers, Shane Devenshire "rushdhih" wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
here are some actual data from which the code needs to be extracted. As you
can see the code appears sometimes in the begining preceded by a semicolon, in the middle or towards the end. I have marked as 1,2,3 & 4 for clarity to show the possibilities 1)AP Accruals; A52101Battery 6 cell; A52101 Battery 6 cell; LKOC_RH_Toshiba Re-Inv#3004461 2); ; A52101 Internet usage charges on Dialog USB Modem for December 2008; 3)AP Accruals; A52101 Internet usage charges; A52101 Internet usage charges on Dialog USB Modem for December 2008; LKOC_RH_Dialog Internet-Dec 08 4)2008 PO ACCRUAL REVERSAL; ; A52101 PCB; Thank you for any help you can give. "Shane Devenshire" wrote: Why don't you show us a few examples, technically this could be A52101 A52101A52101 zzzA52101 dfsdsafdsa A52101 amb asA52101bcdA52101 B123456787vA52101dsfdsf How can you tell all these apart? How can you tell which 6 digit code is really a code as in the last example? -- If this helps, please click the Yes button Cheers, Shane Devenshire "rushdhih" wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
Consider the following User Defined Function:
Function lookit(r As Range) As String Dim s As String s = r.Value Dim outputt(5) As String iPlace = 0 For i = 1 To Len(s) ch = Mid(s, i, 1) If iPlace = 0 Then If caps(ch) Then outputt(iPlace) = ch iPlace = iPlace + 1 End If Else If digit(ch) Then outputt(iPlace) = ch iPlace = iPlace + 1 If iPlace = 6 Then Exit For Else iPlace = 0 End If End If Next lookit = Join(outputt, "") End Function Function digit(v As Variant) As Boolean v2 = Array("1", "2", "3", "4", "5", "6", "7", "8", "9", "0") digit = False For i = LBound(v2) To UBound(v2) If v = v2(i) Then digit = True Exit Function End If Next End Function Function caps(v As Variant) As Boolean Strn = "A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y ,Z" v2 = Split(Strn, ",") caps = False For i = LBound(v2) To UBound(v2) If v = v2(i) Then caps = True Exit Function End If Next End Function So, for example, if A1 contained: 87326428A12345lksfasjk =lookit(A1) would display: A12345 -- Gary''s Student - gsnu200834 "rushdhih" wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help This UDF will do what you describe: ==================== Option Explicit Function GetCode(sTxt As String) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.Pattern = "[A-Z]\d{5}" If re.test(sTxt) = True Then Set mc = re.Execute(sTxt) GetCode = mc(0).Value End If End Function ======================== To enter it, <alt-F11 opens the VBEditor. Ensure your project is highlighted in the project explorer window, then Insert/Module and paste the code above into the window that opens. Then enter the function =GetCode(cell_ref) in some cell. As written, this will return the first substring that consists of a capitalized letter followed by at least 5 digits. One of the issues that you may run into is how to delineate the code. From your examples, it seems to be the case that the code could be followed immediately by a capital letter (e.g. in your example 1: ...A52101Battery 6 cell;...) Could it also be followed immediately by a number? Will the letter be preceded by anything other than a <space. The answers to these two questions could help construct a more robust pattern to use in the above UDF. --ron |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih
wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help Here's another UDF that does not use Regular Expressions: ======================== Option Explicit Option Compare Binary Function GetCode(sTxt As String) As String Const sPattern As String = "[A-Z]#####" Dim i As Long For i = 1 To Len(sTxt) - 6 If Mid(sTxt, i, 6) Like sPattern Then GetCode = Mid(sTxt, i, 6) Exit Function End If Next i End Function ================================= --ron |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
Thank you so much. It worked perfectly for me.
"Ron Rosenfeld" wrote: On Wed, 18 Feb 2009 06:58:01 -0800, rushdhih wrote: In order to meet client reporting requirements, we download data from our ERP system into Excel. Our ERP system has limitations and to overcome this we have codified client budget data. Every transaction description field entered into the ERP system includes a 6 digit code (Eg:A52101). The problem I am encountering in the downloaded data is there are system generated text and other spurious text that creeps in during the conversion process. The code appears sometime in the left, right or center of the transaction description together with other text. I need a formula that will check through the transaction description and extract only the first occurence of the code. The pattern of the code - Alpha + 5 Numeric character (no spaces) Thank you in advance for your help Here's another UDF that does not use Regular Expressions: ======================== Option Explicit Option Compare Binary Function GetCode(sTxt As String) As String Const sPattern As String = "[A-Z]#####" Dim i As Long For i = 1 To Len(sTxt) - 6 If Mid(sTxt, i, 6) Like sPattern Then GetCode = Mid(sTxt, i, 6) Exit Function End If Next i End Function ================================= --ron |
#8
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
How to extract specific text from a string of characters
On Wed, 18 Feb 2009 20:39:01 -0800, rushdhih
wrote: Thank you so much. It worked perfectly for me. You're welcome. Thanks for the feedback. --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
extact text string from specific cell except three last characters | Excel Worksheet Functions | |||
Formula to extract a specific word from text string | Excel Worksheet Functions | |||
Extract specific value from a long text string | Excel Worksheet Functions | |||
Text String - Specific Characters | Excel Worksheet Functions | |||
trim a string by specific number of characters | Excel Discussion (Misc queries) |