![]() |
extracting info from an address block
I recieve info from clients in outlook.
The info is client addresses and phone. I would like to build a little function that when I drop the information into an excel worksheet, that it would process the information to extract the NAME ADDRESS1 ADDRESS2 CITY ST ZIP PHONE PHONE2 the info may come like name address1 city, st zip name addr1 add2 city, st zip It my have 0-2 phones on the line after the city, st zip I am not sure if it would be best to paste into the formula bar (and all take 1 cell), or to paste into a cell, and then it would be 3-7 cells. Any idea how to handle this? How could I tell if there is 1 or 2 address lines? Would it be best to just do something like: NAME = A1 ADDRESS =A2 ADDRESS2 =if(search(",",A3)0,"",A3) City =if(search(",",A3)0,left(A3,search(","A3)),left(A 4,search(",",A4))) St =?? ZIP =?? phone1 =if(search(",",A3)0,A5,A6) 'how would I format it as a phone? Phone2 =if(search(",",A3)0,A7,A8) Thanks Bruce |
extracting info from an address block
On Fri, 25 May 2007 09:48:36 -0700, "Bruce" wrote:
I recieve info from clients in outlook. The info is client addresses and phone. I would like to build a little function that when I drop the information into an excel worksheet, that it would process the information to extract the NAME ADDRESS1 ADDRESS2 CITY ST ZIP PHONE PHONE2 the info may come like name address1 city, st zip name addr1 add2 city, st zip It my have 0-2 phones on the line after the city, st zip I am not sure if it would be best to paste into the formula bar (and all take 1 cell), or to paste into a cell, and then it would be 3-7 cells. Any idea how to handle this? How could I tell if there is 1 or 2 address lines? Would it be best to just do something like: NAME = A1 ADDRESS =A2 ADDRESS2 =if(search(",",A3)0,"",A3) City =if(search(",",A3)0,left(A3,search(","A3)),left( A4,search(",",A4))) St =?? ZIP =?? phone1 =if(search(",",A3)0,A5,A6) 'how would I format it as a phone? Phone2 =if(search(",",A3)0,A7,A8) Thanks Bruce I'm not sure what you mean when you write you "recieve info from clients in outlook." However, why not just export the information from Outlook in Excel format? --ron |
extracting info from an address block
Sorry,
These come to me in the body of an email, so they are just straight text. And they come to me usually as 2 or three address in the body per email. Thanks Bruce "Ron Rosenfeld" wrote in message ... On Fri, 25 May 2007 09:48:36 -0700, "Bruce" wrote: I recieve info from clients in outlook. The info is client addresses and phone. I would like to build a little function that when I drop the information into an excel worksheet, that it would process the information to extract the NAME ADDRESS1 ADDRESS2 CITY ST ZIP PHONE PHONE2 the info may come like name address1 city, st zip name addr1 add2 city, st zip It my have 0-2 phones on the line after the city, st zip I am not sure if it would be best to paste into the formula bar (and all take 1 cell), or to paste into a cell, and then it would be 3-7 cells. Any idea how to handle this? How could I tell if there is 1 or 2 address lines? Would it be best to just do something like: NAME = A1 ADDRESS =A2 ADDRESS2 =if(search(",",A3)0,"",A3) City =if(search(",",A3)0,left(A3,search(","A3)),left (A4,search(",",A4))) St =?? ZIP =?? phone1 =if(search(",",A3)0,A5,A6) 'how would I format it as a phone? Phone2 =if(search(",",A3)0,A7,A8) Thanks Bruce I'm not sure what you mean when you write you "recieve info from clients in outlook." However, why not just export the information from Outlook in Excel format? --ron |
extracting info from an address block
On Fri, 25 May 2007 09:48:36 -0700, "Bruce" wrote:
I recieve info from clients in outlook. The info is client addresses and phone. I would like to build a little function that when I drop the information into an excel worksheet, that it would process the information to extract the NAME ADDRESS1 ADDRESS2 CITY ST ZIP PHONE PHONE2 the info may come like name address1 city, st zip name addr1 add2 city, st zip It my have 0-2 phones on the line after the city, st zip I am not sure if it would be best to paste into the formula bar (and all take 1 cell), or to paste into a cell, and then it would be 3-7 cells. Any idea how to handle this? How could I tell if there is 1 or 2 address lines? Would it be best to just do something like: NAME = A1 ADDRESS =A2 ADDRESS2 =if(search(",",A3)0,"",A3) City =if(search(",",A3)0,left(A3,search(","A3)),left( A4,search(",",A4))) St =?? ZIP =?? phone1 =if(search(",",A3)0,A5,A6) 'how would I format it as a phone? Phone2 =if(search(",",A3)0,A7,A8) Thanks Bruce This is a complicated problem, in part for the reasons you lay out. One way to solve this is by using regular expressions. This would depend on having some rules to define the various fields. For example, let us paste the data into the function bar, so all the date will be in one cell (e.g. A2). Then we can parse out the data into the subsequent columns. Label: B1: Name C1: Address 1 D1: Address 2 E1: City F1: State G1: Zip H1: Phone 1 H2: Phone 2 We have to make some assumptions though, and the accuracy of the parsing will depend on the accuracy of those assumptions. I made the following assumptions: Name: Is the first line Address 1: Is the second line that contains ONLY letters, numbers or spaces. Address 2: Is the third line that contains ONLY letters, numbers or spaces. City: Is the first line that starts with a sequence of characters -- up to but not including a comma. State: The sequence of <non-space's that follows the combination of comma<space Zip: The series of digits and or dash that follows the State. It needs to be formatted as 5 digits, optional dash, optional four digits. And it may be absent. Phone1: First line that has a sequence of digits, dashes, spaces and parentheses with a minimum of seven characters. Phone2: Second line with the above sequence. Comments: The phone number definition could be more robust, but that may not be necessary. It is fairly critical that the City, state zip line be the only line that contains a comma. The easiest way to implement the above is to download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr/ Then use these formulas: B2: =REGEX.MID(A2,".*") C2: =REGEX.MID($A2,"(?m)^[\w ]*$",2) D2: =REGEX.MID($A2,"(?m)^[\w ]*$",3) E2: =REGEX.MID(A2,"(?m)^.*(?=,)") F2: =REGEX.MID(A2,"(?<=,\s)\S+") G2: =REGEX.SUBSTITUTE(A2,"(?s).*\w+,\s+\w+\s+(\d{5}-?(\d{4})?)?.*","[1]") H2: =REGEX.MID(A2,"[-()\d ]{7,}") I2: =REGEX.MID(A2,"[-()\d ]{7,}",2) If the formulas do not work, then I have probably made some assumptions about your data that are not so. So post back with sanitized examples of the problem data. Let me know if this works for you. --ron |
extracting info from an address block
On Fri, 25 May 2007 10:52:14 -0700, "Bruce" wrote:
Sorry, These come to me in the body of an email, so they are just straight text. And they come to me usually as 2 or three address in the body per email. Thanks Bruce Two changes already <g for the phone numbers: H2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$") I2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$",2) I had to ensure that the ONLY characters on that line were digits, dash, parentheses, or spaces. --ron |
extracting info from an address block
Thank you very much Ron!
Sorry I have not responded before, but I have been out of town and had no 'net access. Seems to be getting close! There is a little issue with phone nums, especailly if the address contains a numerical street name: Joe Doe 1259 98TH AVE Anytown, CA 92111 (886) 123 3332 gives me the results: (note the 1259 98 after the zip) Joe Doe 1259 98TH AVE Anytown CA 92111 1259 98 (886) 123 3332 Now if the address is a non numerical street, it works fine Joe Doe 1259 Main AVE Anytown, CA 92111 (886) 123 3332 Does this look like it will work properly? (it seems to, but I am not as familiar with this addin as you are!) =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}") and =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}",2) Also, I am getting problems after I added this addin. What seems to happen if I edit this so that I have the formula wizard (what you get when you select the funtion from the function dropdown), I get an error that excel has stopped responding and then it restarts. I have Excel 2007. Again, thanks for your assistance! Bruce "Ron Rosenfeld" wrote in message ... On Fri, 25 May 2007 09:48:36 -0700, "Bruce" wrote: I recieve info from clients in outlook. The info is client addresses and phone. I would like to build a little function that when I drop the information into an excel worksheet, that it would process the information to extract the NAME ADDRESS1 ADDRESS2 CITY ST ZIP PHONE PHONE2 the info may come like name address1 city, st zip name addr1 add2 city, st zip It my have 0-2 phones on the line after the city, st zip I am not sure if it would be best to paste into the formula bar (and all take 1 cell), or to paste into a cell, and then it would be 3-7 cells. Any idea how to handle this? How could I tell if there is 1 or 2 address lines? Would it be best to just do something like: NAME = A1 ADDRESS =A2 ADDRESS2 =if(search(",",A3)0,"",A3) City =if(search(",",A3)0,left(A3,search(","A3)),left (A4,search(",",A4))) St =?? ZIP =?? phone1 =if(search(",",A3)0,A5,A6) 'how would I format it as a phone? Phone2 =if(search(",",A3)0,A7,A8) Thanks Bruce This is a complicated problem, in part for the reasons you lay out. One way to solve this is by using regular expressions. This would depend on having some rules to define the various fields. For example, let us paste the data into the function bar, so all the date will be in one cell (e.g. A2). Then we can parse out the data into the subsequent columns. Label: B1: Name C1: Address 1 D1: Address 2 E1: City F1: State G1: Zip H1: Phone 1 H2: Phone 2 We have to make some assumptions though, and the accuracy of the parsing will depend on the accuracy of those assumptions. I made the following assumptions: Name: Is the first line Address 1: Is the second line that contains ONLY letters, numbers or spaces. Address 2: Is the third line that contains ONLY letters, numbers or spaces. City: Is the first line that starts with a sequence of characters -- up to but not including a comma. State: The sequence of <non-space's that follows the combination of comma<space Zip: The series of digits and or dash that follows the State. It needs to be formatted as 5 digits, optional dash, optional four digits. And it may be absent. Phone1: First line that has a sequence of digits, dashes, spaces and parentheses with a minimum of seven characters. Phone2: Second line with the above sequence. Comments: The phone number definition could be more robust, but that may not be necessary. It is fairly critical that the City, state zip line be the only line that contains a comma. The easiest way to implement the above is to download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr/ Then use these formulas: B2: =REGEX.MID(A2,".*") C2: =REGEX.MID($A2,"(?m)^[\w ]*$",2) D2: =REGEX.MID($A2,"(?m)^[\w ]*$",3) E2: =REGEX.MID(A2,"(?m)^.*(?=,)") F2: =REGEX.MID(A2,"(?<=,\s)\S+") G2: =REGEX.SUBSTITUTE(A2,"(?s).*\w+,\s+\w+\s+(\d{5}-?(\d{4})?)?.*","[1]") H2: =REGEX.MID(A2,"[-()\d ]{7,}") I2: =REGEX.MID(A2,"[-()\d ]{7,}",2) If the formulas do not work, then I have probably made some assumptions about your data that are not so. So post back with sanitized examples of the problem data. Let me know if this works for you. --ron |
extracting info from an address block
On Tue, 29 May 2007 09:58:11 -0700, "Bruce" wrote:
Thank you very much Ron! Sorry I have not responded before, but I have been out of town and had no 'net access. Seems to be getting close! There is a little issue with phone nums, especailly if the address contains a numerical street name: Joe Doe 1259 98TH AVE Anytown, CA 92111 (886) 123 3332 gives me the results: (note the 1259 98 after the zip) Joe Doe 1259 98TH AVE Anytown CA 92111 1259 98 (886) 123 3332 Now if the address is a non numerical street, it works fine Joe Doe 1259 Main AVE Anytown, CA 92111 (886) 123 3332 Does this look like it will work properly? (it seems to, but I am not as familiar with this addin as you are!) =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}") and =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}",2) Also, I am getting problems after I added this addin. What seems to happen if I edit this so that I have the formula wizard (what you get when you select the funtion from the function dropdown), I get an error that excel has stopped responding and then it restarts. I have Excel 2007. Bruce, It looks as if, perhaps because my posts threaded oddly, that you are using the original and not the revised version of the phone number extraction functions. It should be -- H2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$") I2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$",2) Those should work. So far as I know, the add-in has not been tested with Excel 2007, and I have Excel 2003, so cannot check it. So I suppose you might be seeing an incompatibility. Similar functions to those that Longre uses can be written in VBA, if that is an issue. But first, see if the revised phone number functions work properly on your data. Then we can see about translating to VBA. --ron |
extracting info from an address block
On Tue, 29 May 2007 14:38:19 -0400, Ron Rosenfeld
wrote: On Tue, 29 May 2007 09:58:11 -0700, "Bruce" wrote: Thank you very much Ron! Sorry I have not responded before, but I have been out of town and had no 'net access. Seems to be getting close! There is a little issue with phone nums, especailly if the address contains a numerical street name: Joe Doe 1259 98TH AVE Anytown, CA 92111 (886) 123 3332 gives me the results: (note the 1259 98 after the zip) Joe Doe 1259 98TH AVE Anytown CA 92111 1259 98 (886) 123 3332 Now if the address is a non numerical street, it works fine Joe Doe 1259 Main AVE Anytown, CA 92111 (886) 123 3332 Does this look like it will work properly? (it seems to, but I am not as familiar with this addin as you are!) =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}") and =REGEX.MID(RIGHT(B2,LEN(B2) - SEARCH(I3,B2)),"[-()\d ]{7,}",2) Also, I am getting problems after I added this addin. What seems to happen if I edit this so that I have the formula wizard (what you get when you select the funtion from the function dropdown), I get an error that excel has stopped responding and then it restarts. I have Excel 2007. Bruce, It looks as if, perhaps because my posts threaded oddly, that you are using the original and not the revised version of the phone number extraction functions. It should be -- H2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$") I2: =REGEX.MID(A2,"(?m)^[-()\d ]{7,}$",2) Those should work. So far as I know, the add-in has not been tested with Excel 2007, and I have Excel 2003, so cannot check it. So I suppose you might be seeing an incompatibility. Similar functions to those that Longre uses can be written in VBA, if that is an issue. But first, see if the revised phone number functions work properly on your data. Then we can see about translating to VBA. --ron Here's a VBA variant. It will parse the contents of "Selection" into the cells to the right: ============================================= Option Explicit Sub ParseAdrBlock() Dim c As Range Const pName As String = ".*" Const pAdr1 As String = "^[\w ]*[A-Za-z]+[\w ]*$" 'Set Multiline = True; Index 2 Const pAdr2 As String = "^[\w ]*[A-Za-z]+[\w ]*$" 'Set Multiline = True; Index 3 Const pCity As String = "^.*(?=,)" 'Set Multiline = True Const pState As String = "[\s\S]+,\s(\S+)[\s\S]+" 'Mulitline True; Return $1; resub Const pZip As String = "[\s\S]+,\s(\S+)[\s\S]+" 'multiline True; resub; Return $1 Const pPhone1 As String = "^[-()\d ]{7,}$" 'Multiline True Const pPhone2 As String = "^[-()\d ]{7,}$" 'Multiline True; Index 2 For Each c In Selection c.Range("B1", "I1").ClearContents c.Offset(0, 1).Value = REMid(c.Text, pName) c.Offset(0, 2).Value = REMid(c.Text, pAdr1, 2, , True) c.Offset(0, 3).Value = REMid(c.Text, pAdr2, 3, , True) c.Offset(0, 4).Value = REMid(c.Text, pCity, , , True) c.Offset(0, 5).Value = RESub(c.Text, pState, "$1") c.Offset(0, 6).Value = RESub(c.Text, pZip, "$1") c.Offset(0, 7).Value = REMid(c.Text, pPhone1, , , True) c.Offset(0, 8).Value = REMid(c.Text, pPhone2, 2, , True) Next c End Sub Function RESub(str As String, SrchFor As String, ReplWith As String) As String Dim objRegExp As RegExp Set objRegExp = New RegExp objRegExp.Pattern = SrchFor objRegExp.IgnoreCase = True objRegExp.Global = True objRegExp.MultiLine = True RESub = objRegExp.Replace(str, ReplWith) End Function Function REMid(str As String, Pattern As String, _ Optional Index As Variant = 1, _ Optional CaseSensitive As Boolean = True, _ Optional MultiLin As Boolean = False) _ As Variant 'Variant as value may be string or array Dim objRegExp As RegExp Dim objMatch As Match Dim colMatches As MatchCollection Dim i As Long 'counter Dim t() As String 'container for array results ' Create a regular expression object. Set objRegExp = New RegExp 'Set the pattern by using the Pattern property. objRegExp.Pattern = Pattern ' Set Case Insensitivity. objRegExp.IgnoreCase = Not CaseSensitive 'Set global applicability. objRegExp.Global = True 'Set multiline objRegExp.MultiLine = MultiLin 'Test whether the String can be compared. If (objRegExp.Test(str) = True) Then 'Get the matches. Set colMatches = objRegExp.Execute(str) ' Execute search. On Error Resume Next 'return null string if a colmatch index is non-existent If IsArray(Index) Then ReDim t(1 To UBound(Index)) For i = 1 To UBound(Index) t(i) = colMatches(Index(i) - 1) Next i REMid = t() Else REMid = CStr(colMatches(Index - 1)) If IsEmpty(REMid) Then REMid = "" End If On Error GoTo 0 'reset error handler Else REMid = "" End If End Function =================================== --ron |
extracting info from an address block
On Tue, 29 May 2007 16:26:31 -0400, Ron Rosenfeld
wrote: --ron Here's a VBA variant. It will parse the contents of "Selection" into the cells to the right: I negelected to mention that you will need to set a reference in VBA Select, from the main menu bar in the VB Editor: Tools/References Then SELECT Microsoft VBScript Regular Expressions 5.5 --ron |
extracting info from an address block
Thanks!
It seems to work well! I have not tested it to much yet. Thank you for all of your help! Bruce "Ron Rosenfeld" wrote in message ... On Tue, 29 May 2007 16:26:31 -0400, Ron Rosenfeld wrote: --ron Here's a VBA variant. It will parse the contents of "Selection" into the cells to the right: I negelected to mention that you will need to set a reference in VBA Select, from the main menu bar in the VB Editor: Tools/References Then SELECT Microsoft VBScript Regular Expressions 5.5 --ron |
extracting info from an address block
On Wed, 30 May 2007 08:37:42 -0700, "Bruce" wrote:
Thanks! It seems to work well! I have not tested it to much yet. Thank you for all of your help! Bruce Let me know if there are any problems. I noted there were some line wrap issues in how it came through, but if you've been using it, I presume you got it worked out. The differences in the regex's between the Morefunc and the VBA solutions is due to small differences in how these different programs handle regular expressions. Especially with regard to the mulitiline command. --ron |
All times are GMT +1. The time now is 11:33 PM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com