Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
All,
I have a very useful report I get weekly (text document) that is in a very un-useful format... It gives me information on all my personnel (hundreds of folks) in a "justified" format. I want to import the information I need from each individual into excel (CSV file) so I can manipulate the data or search, etc... In the end, I'll probably use Access - but I still have the same problem. The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. I "use-ta-could" in PERL, or I could teach myself using C#. I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. Any help putting me in the right direction would be greatly appreciated. Rob F |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Rob" wrote...
.... The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. .... I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. .... VBA itself doesn't have regular expressions, but since you mention C#, it's a fair bet you're using Excel under Windows. If so, you almost certainly have Windows Script Host installed, in which case you could use VBScript/VJScript regular expressions. The documentation is available online from MSDN. They used to have it in a downloadable form, but who knows whether that's still available. As for how to use it, see http://groups-beta.google.com/group/...c5c9457939edb5 |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Hi Rob,
If the file is a text file and it is regularly separated, you should be able to simply open it in Excel. YOu will be taken to various screens and alternatives to decsribe the data layout. I am not sure what you mean by "justified", but Excel and open many formats. Comma, tab, space, etc. You may have to play around with the opening of it to find the right method that fits your data, but I expect it will open and be useful. "Rob" wrote: All, I have a very useful report I get weekly (text document) that is in a very un-useful format... It gives me information on all my personnel (hundreds of folks) in a "justified" format. I want to import the information I need from each individual into excel (CSV file) so I can manipulate the data or search, etc... In the end, I'll probably use Access - but I still have the same problem. The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. I "use-ta-could" in PERL, or I could teach myself using C#. I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. Any help putting me in the right direction would be greatly appreciated. Rob F |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
David,
When I say "justified," I mean it's a database report that is justified to the screen (fields are arranged to fit the screen -- like a justified form in ACCESS): Name: Joe Blow SSN: 987-65-4321 Home Phone: (850) 555-1234 Start-Date: 12/25/2004 Occupation: Dead beat DOB: 01/01/1986 Education: Graduated from 6th Grade address, etc... With Regular Expressions, I can grab a line, parse the data I want ("Joe Blow", "(850) 555-1234") and insert into a new csv file/database/etc. I used to do it all the time in PERL, but I haven't used the language in 2 - 3 years. I can learn to do it in C#, but then I have the same problem -- I will have to ensure every machine can run PERL or C#, and I will have to train every user (sometimes it's easier to teach a monkey to fly). Can I do anything similar to regular Expressions with VBA? "David" wrote in message ... Hi Rob, If the file is a text file and it is regularly separated, you should be able to simply open it in Excel. YOu will be taken to various screens and alternatives to decsribe the data layout. I am not sure what you mean by "justified", but Excel and open many formats. Comma, tab, space, etc. You may have to play around with the opening of it to find the right method that fits your data, but I expect it will open and be useful. "Rob" wrote: All, I have a very useful report I get weekly (text document) that is in a very un-useful format... It gives me information on all my personnel (hundreds of folks) in a "justified" format. I want to import the information I need from each individual into excel (CSV file) so I can manipulate the data or search, etc... In the end, I'll probably use Access - but I still have the same problem. The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. I "use-ta-could" in PERL, or I could teach myself using C#. I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. Any help putting me in the right direction would be greatly appreciated. Rob F |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Hi Rob,
You get an electronic form of this file? When I look at what you have below, this all comes on one line? A record per line? What "type" of file, what is the file extension? Have you attempted to open this file with Excel? I am not talking about how is appears on a computer screen, but form the file is is in, prior to it's viewing. Really, try abd open it with Excel and see what happens. If it is what appears below, I think you will be able to open it. Again you may have to play with it and attempt this more than once to get what you want. The process of opening it will "parse" it for you. This does not require and VB code. YOu will taken through several "Screens" called the Text Import Wizard", which allows you to describe the format of the text file you are trying to open in Excel. You will not need to parse it at this time, but simply put it in a form that is readable and useful. When you have it in this form, you can delete whole columns, which will represent fields of data. Try this and see if it will open for you. In Excel - File/Open, make sure that you tell Excel to look for "All Types" in the Open dialog or it will not recogize anything but Excel files. "Rob" wrote: David, When I say "justified," I mean it's a database report that is justified to the screen (fields are arranged to fit the screen -- like a justified form in ACCESS): Name: Joe Blow SSN: 987-65-4321 Home Phone: (850) 555-1234 Start-Date: 12/25/2004 Occupation: Dead beat DOB: 01/01/1986 Education: Graduated from 6th Grade address, etc... With Regular Expressions, I can grab a line, parse the data I want ("Joe Blow", "(850) 555-1234") and insert into a new csv file/database/etc. I used to do it all the time in PERL, but I haven't used the language in 2 - 3 years. I can learn to do it in C#, but then I have the same problem -- I will have to ensure every machine can run PERL or C#, and I will have to train every user (sometimes it's easier to teach a monkey to fly). Can I do anything similar to regular Expressions with VBA? "David" wrote in message ... Hi Rob, If the file is a text file and it is regularly separated, you should be able to simply open it in Excel. YOu will be taken to various screens and alternatives to decsribe the data layout. I am not sure what you mean by "justified", but Excel and open many formats. Comma, tab, space, etc. You may have to play around with the opening of it to find the right method that fits your data, but I expect it will open and be useful. "Rob" wrote: All, I have a very useful report I get weekly (text document) that is in a very un-useful format... It gives me information on all my personnel (hundreds of folks) in a "justified" format. I want to import the information I need from each individual into excel (CSV file) so I can manipulate the data or search, etc... In the end, I'll probably use Access - but I still have the same problem. The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. I "use-ta-could" in PERL, or I could teach myself using C#. I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. Any help putting me in the right direction would be greatly appreciated. Rob F |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
David,
I gave "Importing data a try, but it is not powerfull enough to do what I needed. One record consists of anywhere from 29 to 35 fields (The last couple of fields are normally different on every record depending on what information is flagged by the sender. Like I said, I can do this very easily in PERL, but I was hoping to do so in VBA so It's easy for the folks I built it for to use. "David" wrote in message ... Hi Rob, You get an electronic form of this file? When I look at what you have below, this all comes on one line? A record per line? What "type" of file, what is the file extension? Have you attempted to open this file with Excel? I am not talking about how is appears on a computer screen, but form the file is is in, prior to it's viewing. Really, try abd open it with Excel and see what happens. If it is what appears below, I think you will be able to open it. Again you may have to play with it and attempt this more than once to get what you want. The process of opening it will "parse" it for you. This does not require and VB code. YOu will taken through several "Screens" called the Text Import Wizard", which allows you to describe the format of the text file you are trying to open in Excel. You will not need to parse it at this time, but simply put it in a form that is readable and useful. When you have it in this form, you can delete whole columns, which will represent fields of data. Try this and see if it will open for you. In Excel - File/Open, make sure that you tell Excel to look for "All Types" in the Open dialog or it will not recogize anything but Excel files. "Rob" wrote: David, When I say "justified," I mean it's a database report that is justified to the screen (fields are arranged to fit the screen -- like a justified form in ACCESS): Name: Joe Blow SSN: 987-65-4321 Home Phone: (850) 555-1234 Start-Date: 12/25/2004 Occupation: Dead beat DOB: 01/01/1986 Education: Graduated from 6th Grade address, etc... With Regular Expressions, I can grab a line, parse the data I want ("Joe Blow", "(850) 555-1234") and insert into a new csv file/database/etc. I used to do it all the time in PERL, but I haven't used the language in 2 - 3 years. I can learn to do it in C#, but then I have the same problem -- I will have to ensure every machine can run PERL or C#, and I will have to train every user (sometimes it's easier to teach a monkey to fly). Can I do anything similar to regular Expressions with VBA? "David" wrote in message ... Hi Rob, If the file is a text file and it is regularly separated, you should be able to simply open it in Excel. YOu will be taken to various screens and alternatives to decsribe the data layout. I am not sure what you mean by "justified", but Excel and open many formats. Comma, tab, space, etc. You may have to play around with the opening of it to find the right method that fits your data, but I expect it will open and be useful. "Rob" wrote: All, I have a very useful report I get weekly (text document) that is in a very un-useful format... It gives me information on all my personnel (hundreds of folks) in a "justified" format. I want to import the information I need from each individual into excel (CSV file) so I can manipulate the data or search, etc... In the end, I'll probably use Access - but I still have the same problem. The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. I "use-ta-could" in PERL, or I could teach myself using C#. I really would like to write the code using VBA because I'm starting to feel comfortable with the language and I've created about a dozen routines that I use in Word/Excel/PowerPoint. I'd like to continue becoming more proficient. Any help putting me in the right direction would be greatly appreciated. Rob F |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
"Tushar Mehta" wrote...
If the contents of the file are a fixed number of columns, each of a fixed character width, you should follow through on David's suggestion. If you still need Regular Expression support, in addition to the other suggestions, you might also want to check http://www.tmehta.com/regexp/ You have a subtle error in one of your examples in http://www.tmehta.com/regexp/examples.htm specifically, "The pattern that recognizes all characters other than letters, numbers, and the underscore character is [^\w+]." You should use \W+ (upper case W rather than lower case w, representing the Unix mindset that case matters). Your regexp is actually an error because it doesn't strip out + characters, and your claim about what the + char does in your regexp is wrong because closure operators don't apply within character classes. Just try =RegExpSubstitute("a+1+_","[^\w+]","") and you'll see the result is "a+1+_". Inside character classes, *, + and ? are treated as literal characters, so your regexp is the complement of the SINGLE CHARACTER class made up of 'word' (token) characters *AND* +. So your regexp also does a lot more work than it should. While you could use [^\w]+, that's foolish compared to \W+. You have another subtle error in "The sub-expression ((\d|.)+) isolates the number, which may contain decimal values, and the next sub-expression (CR) isolates the CR." Your subexpression matches any string of one or more periods. The canonical way to express unsigned decimal numbers possibly with decimal fractional parts is (\d*\.)?\d+ While your subexpression may work on your test data and may work most of the time, it doesn't take much to make it work *ALL* of the time. You need to invest some money buying Jeffrey Friedl's "Mastering Regular Expressions" and some time reading and understanding it. You may be an Excel MVP, but you're a regexp newbie. If you think I'm harsh, try learning how to use regexps for real by posting replies in comp.unix.shell, comp.lang.awk and comp.lang.awk, where the regexp experts hang out. |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
use regexps for real by posting replies in comp.unix.shell, comp.lang.awk
and comp.lang.awk, where the regexp experts hang out. Make that comp.unix.shell, comp.lang.awk and comp.lang.perl.misc. |
#10
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Thank you for your comments. Rest assured that both the technical
comments and those of a more personal nature will get all the respect they deserve. -- Regards, Tushar Mehta www.tushar-mehta.com Excel, PowerPoint, and VBA add-ins, tutorials Custom MS Office productivity solutions In article , says... "Tushar Mehta" wrote... If the contents of the file are a fixed number of columns, each of a fixed character width, you should follow through on David's suggestion. If you still need Regular Expression support, in addition to the other suggestions, you might also want to check http://www.tmehta.com/regexp/ You have a subtle error in one of your examples in http://www.tmehta.com/regexp/examples.htm specifically, "The pattern that recognizes all characters other than letters, numbers, and the underscore character is [^\w+]." You should use \W+ (upper case W rather than lower case w, representing the Unix mindset that case matters). Your regexp is actually an error because it doesn't strip out + characters, and your claim about what the + char does in your regexp is wrong because closure operators don't apply within character classes. Just try =RegExpSubstitute("a+1+_","[^\w+]","") and you'll see the result is "a+1+_". Inside character classes, *, + and ? are treated as literal characters, so your regexp is the complement of the SINGLE CHARACTER class made up of 'word' (token) characters *AND* +. So your regexp also does a lot more work than it should. While you could use [^\w]+, that's foolish compared to \W+. You have another subtle error in "The sub-expression ((\d|.)+) isolates the number, which may contain decimal values, and the next sub-expression (CR) isolates the CR." Your subexpression matches any string of one or more periods. The canonical way to express unsigned decimal numbers possibly with decimal fractional parts is (\d*\.)?\d+ While your subexpression may work on your test data and may work most of the time, it doesn't take much to make it work *ALL* of the time. You need to invest some money buying Jeffrey Friedl's "Mastering Regular Expressions" and some time reading and understanding it. You may be an Excel MVP, but you're a regexp newbie. If you think I'm harsh, try learning how to use regexps for real by posting replies in comp.unix.shell, comp.lang.awk and comp.lang.awk, where the regexp experts hang out. |
#11
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Sat, 05 Feb 2005 22:28:58 GMT, "Rob" wrote in
microsoft.public.excel.programming: [snip] The easiest way I know to grab the information I need is using Regular Expressions to parse through each line of the report. It all starts with Set RegEx = CreateObject("vbscript.regexp") See <http://www.vbaexpress.com/kb/getarticle.php?kb_id=68 and many more for examples. Searching the Internet (web and groups) returns quite a number of useful results. -- Michael Bednarek http://mbednarek.com/ "POST NO BILLS" |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Get rid of with regular expressions | Excel Discussion (Misc queries) | |||
Regular expressions in VB | Excel Discussion (Misc queries) | |||
Regular Expressions & Middle Name | Excel Worksheet Functions | |||
Regular expressions in Excel | Excel Discussion (Misc queries) | |||
VBA and Regular expressions | Excel Programming |