![]() |
Spliting rather mixed up address
Hello,
On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem€¦ For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem€¦ On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch€¦. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem€¦ Regards -- Jeff B Paarsa |
Spliting rather mixed up address
Jeffery B Paarsa wrote:
Hello, On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem€¦ For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem€¦ On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch€¦. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem€¦ Regards Take heart, a programmatic approach to parsing user data entry is beyond just about everyone except Google. Yours is the best example I've seen of the worst possible data to parse, and highlights why there is no single solution to these kinds of problems. Given what you have, I think you have a good approach, but you have not said how many rows you have before you. Depending on this, it could be more efficient to parse the data by hand and eye rather than spend the time to program what will ultimately be an imperfect solution at best. Regarding that programmatic approach... look for low-hanging fruit, divide, and conquer(?). At a minimum, you will need to master the VBA functions LEFT, RIGHT, MID, TRIM, INSTR, and SPLIT. You already mentioned some of this: - strip off the right-most word if it is all numbers, store as ZIP - strip off the next word at right if it is two characters and looks like a state abbreviation, store as state - strip off the leading numeric characters from the left, store as street number - scan the remainder for a digit [0-9]. If found, take the left words as the address and the right words as the city - if no digits are found, scan for common street designations like "ST", "AVE", etc. If found, take the left words as the address and the right words as the city Most of all, good luck! |
Spliting rather mixed up address
There is more than one way to skin a cat!
i suggest you get a ZIP code database. They are available on the Web. Since your records include the ZIP code, you can lookup the state/city information from the database rather than try to parse the string. You will stil have to parse the front-end of the string to get street information, but this simplifies the task for you. -- Gary''s Student - gsnu200857 "Jeffery B Paarsa" wrote: Hello, On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem€¦ For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem€¦ On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch€¦. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem€¦ Regards -- Jeff B Paarsa |
Spliting rather mixed up address
Any code or macro already avilable anywhere to download or purchase?
-- Jeff B Paarsa "Gary''s Student" wrote: There is more than one way to skin a cat! i suggest you get a ZIP code database. They are available on the Web. Since your records include the ZIP code, you can lookup the state/city information from the database rather than try to parse the string. You will stil have to parse the front-end of the string to get street information, but this simplifies the task for you. -- Gary''s Student - gsnu200857 "Jeffery B Paarsa" wrote: Hello, On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem€¦ For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem€¦ On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch€¦. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem€¦ Regards -- Jeff B Paarsa |
Spliting rather mixed up address
On 19 Giu, 23:42, Jeffery B Paarsa wrote:
Any code or macro already avilable anywhere to download or purchase? -- Jeff B Paarsa Hi Jeff, I believe that you problem have not an easy solution; however try: First split your data using Text in Column Then try to construct your DB with this macro: Option Explicit Private Sub Worksheet_BeforeDoubleClick(ByVal _ Target As Range, Cancel As Boolean) Cancel = True Dim C As Long, R As Long, X As Long C = Target.Column R = Target.Row C = C - 1 For X = C To 1 Step -1 If Cells(R, X).Value = "" Then Else Target.Value = Cells(R, X).Value & " " & Target.Value Cells(R, X).Value = "" Exit Sub End If Next End Sub I try to explain the routine also if my english is very poor.:-)) By your sample, text in column will arrange your data in the first 12 columns, that is from column A to L. Effect the doubleclic on the column N and that column will contain the number I supose to be the Zip Code. (Old zip code is blanked) Doubleclic on column M = State (old state is blanked) Doubleclic on column L = City (old city is blanked) If the city is written as SANTA ANA the doubleclic on the same column L, that now contain ANA = SANTA ANA an so on, Continue up to column B if necessary. Regards Eliano |
Spliting rather mixed up address
Hi Jeff. I forget: for the error in writing, here in Italy we have not the "crystall ball" and you had to provide manually. Eliano |
Spliting rather mixed up address
On Thu, 18 Jun 2009 11:51:02 -0700, Jeffery B Paarsa
wrote: Hello, On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem… For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem… On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch…. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem… Regards Here's some code that seems to work on your presented data. It is slow because it is doing its zip code lookup at the USPS site. If this is going to be a frequent occurrence for you, with many entries, you would be better off purchasing a zip code database and keeping it on your machine. But it'll give you an idea of what might be doable. Be sure to read the notes in the code. And also look at the set rng command which is just one way of setting up the range to be processed. There are other methods than hard-coding it. Also note that the code clears out the five columns to the left of the data, so as to allow space for the room. Not knowing how much further you want to go with this, I left out stuff like checking for invalid zip codes (or no zip codes). That should probably be added on depending on your specific requirements. Enjoy. To enter this Macro (Sub), <alt-F11 opens the Visual Basic Editor. Ensure your project is highlighted in the Project Explorer window. Then, from the top menu, select Insert/Module and paste the code below into the window that opens. To use this Macro (Sub), <alt-F8 opens the macro dialog box. Select the macro by name, and <RUN. ======================================== '*** IMPORTANT NOTE *** 'MUST SET REFERENCE (TOOLS/REFERENCES IN MAIN MENU BAR) 'TO: ' Microsoft VBScript Regular Expressions 5.5 ' Microsoft Internet Controls ' In Excel 2007, this is called "Microsoft Browser Helpers" Option Explicit Sub ParseAdr() Dim c As Range, rng As Range Dim s As String Dim ZIP9 As String, ZIP5 As String Dim sCity As String, sState As String Dim sStreet As String, sAddrNum As String Dim sTemp Dim i As Long 'just an example of one method of selecting a range ' to process Set rng = Range("a2:a13") 'clear adjacent cells of old data Set c = rng.Offset(0, 1).Resize(rng.Rows.Count, 5) c.Clear 'process each entry For Each c In rng s = Application.WorksheetFunction.Trim(c.Value) ZIP9 = RegexMid(s, "\d+$") ZIP5 = Left(ZIP9, 5) c.Offset(0, 5).Value = Val(ZIP9) c.Offset(0, 5).NumberFormat = "[<100000]00000;00000-0000" If ZIP5 = "" Then Debug.Print c.Address, c.Value Exit Sub End If sTemp = RevZip(ZIP5) sCity = sTemp(0, 0) sState = sTemp(1, 0) For i = 1 To UBound(RevZip(ZIP5), 2) 'check for which city in the zipcode might match. 'if none match, use the primary city 'remove spaces before comparison to allow for that error in the data If InStr(1, Replace(s, " ", ""), Replace(sTemp(0, i), " ", ""), _ vbTextCompare) 0 Then sCity = sTemp(0, i) sState = sTemp(1, i) Exit For End If Next i c.Offset(0, 3).Value = sCity c.Offset(0, 4).Value = sState sAddrNum = RegexMid(s, "^\d+") c.Offset(0, 1).Value = sAddrNum sStreet = RegexMid(s, sAddrNum & "\s?(.*?)\s?(" & Left(sCity, 7) & "|" _ & Left(Replace(sCity, " ", ""), 7) & ")", , 1) c.Offset(0, 2).Value = sStreet Next c End Sub Private Function RevZip(sZip5 As String) As Variant 'returns 2D array of each city/state pair 'in the zip code Dim IE As InternetExplorer Const sURL As String = "http://zip4.usps.com/zip4/citytown_zip.jsp" Dim sHTML As String Dim sTemp() As String Dim i As Long ' Group2 = City Group3=State IGNORE CASE Const rePattern As String = "headers=pre(<b)?([^,]+),\s([^<]+)" Dim lNumCities As Long Application.Cursor = xlWait Set IE = New InternetExplorer IE.Navigate sURL IE.Visible = False Do While IE.ReadyState < READYSTATE_COMPLETE DoEvents Loop Do While IE.Busy = True DoEvents Loop IE.Document.all("zip5").Value = sZip5 IE.Document.all("Submit").Click Do While IE.ReadyState < READYSTATE_COMPLETE DoEvents Loop Do While IE.Busy = True DoEvents Loop sHTML = IE.Document.body.innerhtml IE.Quit Application.Cursor = xlDefault 'Note that the USPS site can return multiple 'cities for each zip code. So we need to 'return them all lNumCities = RegexCount(sHTML, rePattern) ReDim sTemp(0 To 1, 0 To lNumCities - 1) For i = 0 To lNumCities - 1 sTemp(0, i) = RegexMid(sHTML, rePattern, i + 1, 2) sTemp(1, i) = RegexMid(sHTML, rePattern, i + 1, 3) Next i RevZip = sTemp End Function Private Function RegexMid(s As String, sPat As String, _ Optional Index As Long = 1, _ Optional Subindex As Long, _ Optional CaseIgnore As Boolean = True, _ Optional Glbl As Boolean = True, _ Optional Multiline As Boolean = False) As String Dim re As Object, mc As Object Dim i As Long Set re = CreateObject("vbscript.regexp") re.Pattern = sPat re.IgnoreCase = CaseIgnore re.Global = Glbl re.Multiline = Multiline If re.test(s) = True Then Set mc = re.Execute(s) If Subindex = 0 Then RegexMid = mc(Index - 1) ElseIf Subindex <= mc(Index - 1).SubMatches.Count Then RegexMid = mc(Index - 1).SubMatches(Subindex - 1) End If End If Set re = Nothing End Function Private Function RegexCount(s As String, sPat As String) As Long Dim re As RegExp, mc As MatchCollection Set re = New RegExp re.Pattern = sPat re.Global = True re.IgnoreCase = True Set mc = re.Execute(s) RegexCount = mc.Count Set re = Nothing End Function ============================= --ron |
Spliting rather mixed up address
I tried to run this Module and I got error on Dim IE As InternetExplorer on Private Function RevZip(sZip5 As String) Sounds like one can not define a viriable as InternetExplorer there is no such a definition and I tried Application and I could not get it to work either... Regards Jeff P. "Ron Rosenfeld" wrote: On Thu, 18 Jun 2009 11:51:02 -0700, Jeffery B Paarsa wrote: Hello, On an spread sheet on a single column I receive rather mixed up address that I need to separate the street address into one column and city in another and state into 3rd column and off course zip on 4th column... Here is some off the address I receive: 1295 GAUGUIN CIR ALISOVIEJO CA 926563875 93316 S ARAPAHO DR SANTA ANA CA 927042403 9345 GAUGUIN Cm ALISOVIEJO CA 926563875 2344268 ORANGE AVE APT 3 LAKE FOREST CA 926304886 361449 S EL C AMINO REAL APT 4 SAN CLEMENTE CA 926723451 235521 EL REPOSA ALISOVIEJO CA 926561109 106556 CAIXE DEL CERRO UNIT 1414 SAN CLEMENTE CA 926726075 204475 S BIRCH ST SANTA ANA CA 927072703 125542 HALLADAY ST SANTA ANA CA 927071407 103319 W MYRTLE ST SANTA ANA CA 927033911 3342 ENTERPRISE APT 4106 ALISOVIEJO CA 926567091 6100EDINGER AVE APT 326 HUNTJNGTNBCH CA 926473266 As you may have noticed not only there are a lot of errors on the spelling but also city names sometimes come in two words i.e. Santa Ana or ALISOVIEJO which should be "Aliso Viejo" and as you see sometimes "6100 Edinger" comes like "6100EDINGER" I search for "Split Address" on the Forum and I did not see anything that can tackle such a complicated situation... Splitting each Word into a separate column does not work because sometimes city names are two word as I explained above. I know I need to do some kind of hand cleaning before/after I run any special macro that can handle such a task but at this time I am kind of baffled how to this huge task... I came up with idea that if I scan the address column and take the last piece of character with a leading and trailing space i.e. 926473266 and stick it into Zip column and prior leading/trailing space as State and stick it into State Column then I have one more to strip of the Address strings and that would be City name which unfortunately because I have miss spelling and sometimes two word as city name I have problem€¦ For city name sometimes I have to pick up two words prior to State and Zip sometimes only one word and sometime I have spelling problem€¦ On line one of the sample address above for example 927042403 can be picked up and stick into Zip code column, CA can be picked up and stick into State Column but there is no city name by the name of ANA in California so I have to pick up Word before ANA too which is SANTA ANA and stick it into city name column and for the last example address line HUNTJNGTNBCH has spelling problem along with two word city name and the correct name is Huntington Beach or Bch€¦. A macro that can look up the zip code and give me a spelling suggestion or correct city name would be awesome. Honestly programming such a complicated macro is way above my knowledge of VBA programming strengths. I thought if I posted it somebody may give me some suggestion or there is an already Add-in macro exist etc to solve such a problem€¦ Regards Here's some code that seems to work on your presented data. It is slow because it is doing its zip code lookup at the USPS site. If this is going to be a frequent occurrence for you, with many entries, you would be better off purchasing a zip code database and keeping it on your machine. But it'll give you an idea of what might be doable. Be sure to read the notes in the code. And also look at the set rng command which is just one way of setting up the range to be processed. There are other methods than hard-coding it. Also note that the code clears out the five columns to the left of the data, so as to allow space for the room. Not knowing how much further you want to go with this, I left out stuff like checking for invalid zip codes (or no zip codes). That should probably be added on depending on your specific requirements. Enjoy. To enter this Macro (Sub), <alt-F11 opens the Visual Basic Editor. Ensure your project is highlighted in the Project Explorer window. Then, from the top menu, select Insert/Module and paste the code below into the window that opens. To use this Macro (Sub), <alt-F8 opens the macro dialog box. Select the macro by name, and <RUN. ======================================== '*** IMPORTANT NOTE *** 'MUST SET REFERENCE (TOOLS/REFERENCES IN MAIN MENU BAR) 'TO: ' Microsoft VBScript Regular Expressions 5.5 ' Microsoft Internet Controls ' In Excel 2007, this is called "Microsoft Browser Helpers" Option Explicit Sub ParseAdr() Dim c As Range, rng As Range Dim s As String Dim ZIP9 As String, ZIP5 As String Dim sCity As String, sState As String Dim sStreet As String, sAddrNum As String Dim sTemp Dim i As Long 'just an example of one method of selecting a range ' to process Set rng = Range("a2:a13") 'clear adjacent cells of old data Set c = rng.Offset(0, 1).Resize(rng.Rows.Count, 5) c.Clear 'process each entry For Each c In rng s = Application.WorksheetFunction.Trim(c.Value) ZIP9 = RegexMid(s, "\d+$") ZIP5 = Left(ZIP9, 5) c.Offset(0, 5).Value = Val(ZIP9) c.Offset(0, 5).NumberFormat = "[<100000]00000;00000-0000" If ZIP5 = "" Then Debug.Print c.Address, c.Value Exit Sub End If sTemp = RevZip(ZIP5) sCity = sTemp(0, 0) sState = sTemp(1, 0) For i = 1 To UBound(RevZip(ZIP5), 2) 'check for which city in the zipcode might match. 'if none match, use the primary city 'remove spaces before comparison to allow for that error in the data If InStr(1, Replace(s, " ", ""), Replace(sTemp(0, i), " ", ""), _ vbTextCompare) 0 Then sCity = sTemp(0, i) sState = sTemp(1, i) Exit For End If Next i c.Offset(0, 3).Value = sCity c.Offset(0, 4).Value = sState sAddrNum = RegexMid(s, "^\d+") c.Offset(0, 1).Value = sAddrNum sStreet = RegexMid(s, sAddrNum & "\s?(.*?)\s?(" & Left(sCity, 7) & "|" _ & Left(Replace(sCity, " ", ""), 7) & ")", , 1) c.Offset(0, 2).Value = sStreet Next c End Sub Private Function RevZip(sZip5 As String) As Variant 'returns 2D array of each city/state pair 'in the zip code Dim IE As InternetExplorer Const sURL As String = "http://zip4.usps.com/zip4/citytown_zip.jsp" Dim sHTML As String Dim sTemp() As String Dim i As Long ' Group2 = City Group3=State IGNORE CASE Const rePattern As String = "headers=pre(<b)?([^,]+),\s([^<]+)" Dim lNumCities As Long Application.Cursor = xlWait Set IE = New InternetExplorer IE.Navigate sURL IE.Visible = False Do While IE.ReadyState < READYSTATE_COMPLETE DoEvents Loop Do While IE.Busy = True DoEvents Loop IE.Document.all("zip5").Value = sZip5 IE.Document.all("Submit").Click Do While IE.ReadyState < READYSTATE_COMPLETE DoEvents Loop Do While IE.Busy = True DoEvents Loop sHTML = IE.Document.body.innerhtml IE.Quit Application.Cursor = xlDefault 'Note that the USPS site can return multiple 'cities for each zip code. So we need to 'return them all lNumCities = RegexCount(sHTML, rePattern) ReDim sTemp(0 To 1, 0 To lNumCities - 1) For i = 0 To lNumCities - 1 sTemp(0, i) = RegexMid(sHTML, rePattern, i + 1, 2) sTemp(1, i) = RegexMid(sHTML, rePattern, i + 1, 3) Next i RevZip = sTemp End Function Private Function RegexMid(s As String, sPat As String, _ Optional Index As Long = 1, _ Optional Subindex As Long, _ Optional CaseIgnore As Boolean = True, _ Optional Glbl As Boolean = True, _ Optional Multiline As Boolean = False) As String Dim re As Object, mc As Object Dim i As Long Set re = CreateObject("vbscript.regexp") re.Pattern = sPat re.IgnoreCase = CaseIgnore re.Global = Glbl re.Multiline = Multiline If re.test(s) = True Then Set mc = re.Execute(s) If Subindex = 0 Then RegexMid = mc(Index - 1) ElseIf Subindex <= mc(Index - 1).SubMatches.Count Then RegexMid = mc(Index - 1).SubMatches(Subindex - 1) End If End If Set re = Nothing End Function Private Function RegexCount(s As String, sPat As String) As Long Dim re As RegExp, mc As MatchCollection Set re = New RegExp re.Pattern = sPat re.Global = True re.IgnoreCase = True Set mc = re.Execute(s) RegexCount = mc.Count Set re = Nothing End Function ============================= --ron |
All times are GMT +1. The time now is 05:58 PM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com