Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
How can I use VBA to find a pattern such as ‘Checked Out To:’ and
everything between the quotes? The Excel file will have 16 characters, 20 with the .xls extension. Then there is a Return. Then ‘Checked Out To:’ with last name comma first name and there MAY be some brackets after that. Everything will ALWAYS be between quotes. “9900010X10G00904.xls Checked Out To: Jupree, Jasper [C-TO]” “9900010X10G11208.xls Checked Out To: Malfree, Jim” I’m experimenting with a concept such as this: If InStr(1, my_var, "Checked Out To: " & activeWB = "Checked Out To: " & activeWB, vbTextCompare) 0 Then ‘Something . . . MsgBox () End If There is a pattern, as described above, but I’m not sure how to parse everything between the quotes. Just so you know, all data comes from a URL; the my_var variable comes from this URL . . . all of this works fine! (Thanks Ron . . . ) Any ideas on how to handle this? Thanks!! |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Aug 18, 2:28*pm, ryguy7272 wrote:
How can I use VBA to find a pattern such as ‘Checked Out To:’ and everything between the quotes? The Excel file will have 16 characters, 20 with the .xls extension. Then there is a Return. *Then ‘Checked Out To:’ with last name comma first name and there MAY be some brackets after that. *Everything will ALWAYS be between quotes. “9900010X10G00904.xls Checked Out To: Jupree, Jasper [C-TO]” “9900010X10G11208.xls Checked Out To: Malfree, Jim” I’m experimenting with a concept such as this: If InStr(1, my_var, "Checked Out To: " & activeWB = "Checked Out To: " & activeWB, vbTextCompare) 0 Then ‘Something . . . MsgBox () End If There is a pattern, as described above, but I’m not sure how to parse everything between the quotes. Just so you know, all data comes from a URL; the my_var variable comes from this URL . . . all of this works fine! (Thanks Ron . . . ) Any ideas on how to handle this? Thanks!! How about something like... pos_1 = instr(1, my_var, "Checked Out To", vbTextCompare) Do until pos_1 = 0 pos_2 = instrrev(my_var, "xls", pos_1, vbTextCompare) pos_3 = instr(pos_1, my_var, """", vbTextCompare) my_text = mid(my_var, -17 + pos_2, pos_3 - (-17 + pos_2)) ' might need a line like ' my_text=replace(my_text, chr(13),"") ' in order to remove line breaks pos_1 = instr(pos_3, my_var, "Checked Out To", vbTextCompare) Loop ....Ron |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Wed, 18 Aug 2010 13:28:40 -0700 (PDT), ryguy7272
wrote: How can I use VBA to find a pattern such as ‘Checked Out To:’ and everything between the quotes? The Excel file will have 16 characters, 20 with the .xls extension. Then there is a Return. Then ‘Checked Out To:’ with last name comma first name and there MAY be some brackets after that. Everything will ALWAYS be between quotes. “9900010X10G00904.xls Checked Out To: Jupree, Jasper [C-TO]” “9900010X10G11208.xls Checked Out To: Malfree, Jim” I’m experimenting with a concept such as this: If InStr(1, my_var, "Checked Out To: " & activeWB = "Checked Out To: " & activeWB, vbTextCompare) 0 Then ‘Something . . . MsgBox () End If There is a pattern, as described above, but I’m not sure how to parse everything between the quotes. Just so you know, all data comes from a URL; the my_var variable comes from this URL . . . all of this works fine! (Thanks Ron . . . ) Any ideas on how to handle this? Thanks!! It's not clear to me, from what you write, exactly what is in my_var, and exactly what result you want returned. The examples you give make it appear as if you just want to return my_var; probably without the quotation marks, provided the sub-string Checked Out To: is present in the string. Is it really that simple? If so, just use Instr to check for the presence of Checked Out To: and, if present, return my_var (or mid(my_var,2, len(my_var)-2 if you need to remove the quotes). A more detailed description and examples of what you want would be helpful. |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Aug 18, 9:06*pm, Ron Rosenfeld wrote:
On Wed, 18 Aug 2010 13:28:40 -0700 (PDT), ryguy7272 wrote: How can I use VBA to find a pattern such as ‘Checked Out To:’ and everything between the quotes? The Excel file will have 16 characters, 20 with the .xls extension. Then there is a Return. *Then ‘Checked Out To:’ with last name comma first name and there MAY be some brackets after that. *Everything will ALWAYS be between quotes. “9900010X10G00904.xls Checked Out To: Jupree, Jasper [C-TO]” “9900010X10G11208.xls Checked Out To: Malfree, Jim” I’m experimenting with a concept such as this: If InStr(1, my_var, "Checked Out To: " & activeWB = "Checked Out To: " & activeWB, vbTextCompare) 0 Then ‘Something . . . MsgBox () End If There is a pattern, as described above, but I’m not sure how to parse everything between the quotes. Just so you know, all data comes from a URL; the my_var variable comes from this URL . . . all of this works fine! (Thanks Ron . . . ) Any ideas on how to handle this? Thanks!! It's not clear to me, from what you write, exactly what is in my_var, and exactly what result you want returned. *The examples you give make it appear as if you just want to return my_var; probably without the quotation marks, provided the sub-string Checked Out To: is present in the string. Is it really that simple? If so, just use Instr to check for the presence of Checked Out To: and, if present, return my_var (or mid(my_var,2, len(my_var)-2 if you need to remove the quotes). A more detailed description and examples of what you want would be helpful.- Hide quoted text - - Show quoted text - Thanks guys!! I can't use a loop; it will be too slow. I was suing a loop before and had to rethink this due to performance issues. Here’s what I’m trying to do: URL = "https:// . . . " < -- secure web site Set IE = Nothing Set IE = CreateObject("InternetExplorer.Application") With IE .navigate URL .Visible = False .Top = 50 .Left = 50 .Height = 50 .Width = 50 'Wait for page to load While .Busy Or .readyState < 4 Or IE.Busy = True: Wend Set HTMLdoc = .document End With Dim my_var As String Dim my_file As String Dim my_user As String my_var = "" my_var = IE.document.body.innerHTML Now, my_var is a HUGE string. It is supposed to have everything from the URL. If I go to View Source on my web browser, well, that’s the ‘my_var’ Your code gave the this: ‘javascript:return WebForm_OnSubmit();’ I can’t figure out why. It’s in the site, close to the top, but it has nothing to do with what I’m looking for. I’m looking for something like this: 8834310G10X09999.xls Checked Out To: COLE, TIMMY Basically, this is the text ‘Checked Out To:’ and the name of the Excel file. The name of the Excel file comes from he activeWB = strFullString & ".xls" I need to think about this a bit. If you come up with something else in the meantime, please do post back. Thanks!! |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272
wrote: Now, my_var is a HUGE string. It is supposed to have everything from the URL. If I go to View Source on my web browser, well, that’s the ‘my_var’ Post from a few lines before, through a few lines after, the segment that you wish to extract. |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Aug 20, 4:20*pm, Ron Rosenfeld wrote:
On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272 wrote: Now, my_var is a HUGE string. *It is supposed to have everything from the URL. *If I go to View Source on my web browser, well, that’s the ‘my_var’ Post from a few lines before, through a few lines after, the segment that you wish to extract. This is pretty much what I’m searching for: Lines Befo <tr class="ms-alternating" <td class="ms-vb-icon" <a tabIndex="-1" onclick="return DispEx(this,event,'TRUE','FALSE','TRUE','SharePoin t.OpenDocuments. 3','0','SharePoint.OpenDocuments','','','11','11', '0','1','0x400001f07fff1bff')" href="/sites/P/SharedDoc/8834310G10X09999.xls" <img title="8834310G10X09999.xls Checked Out To: COLE, TIMMY" alt="8834310G10X09999.xls Checked Out To: COLE, TIMMY" src="/_layouts/images/icxls.gif" border="0" complete="complete"/ Line After: <td height="100%" class="ms-vb-title" I want to extract this, and pop it into a MsgBox: 8834310G10X09999.xls Checked Out To: COLE, TIMMY There’s always a hard return after the .xls One HUGE string should be in he my_var = IE.document.body.innerHTML Then, I want to look for the name of the Excel file and the name of the person, in the 'my_var' Does this make sense? |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272
wrote: I’m looking for something like this: 8834310G10X09999.xls Checked Out To: COLE, TIMMY Basically, this is the text ‘Checked Out To:’ and the name of the Excel file. The name of the Excel file comes from he activeWB = strFullString & ".xls" I need to think about this a bit. If you come up with something else in the meantime, please do post back. Thanks!! As I alluded to in my previous post, in order to locate specific data, you need to be specific in your description; AND your description needs to be unambiguous. In the example you give, you show only two lines Line 1 contains ONLY an excel file name with a .xls suffix Line 2 starts with "Checked Out To:" and ends with a name that is all in capitals, but may include spaces and a comma. Is that how these entries will always appear within my_var? And will they always be on two consecutive lines with nothing else on those lines? Will there always be just a single entry in my_var, or can there be multiple entries and a need to extract all of them? Here's a code snippet that might pull out what you need, if my assumptions are valid, but will return NOTHING if they are not. In this case, my_var is the contents of A3, but you can change that to inner.text or something similar. ============================== Option Explicit Sub GetCheckedOut() Dim re As Object, mc As Object, m As Object Dim my_var As String Const sPat As String = _ "\b\w+\.xls[\s\S]+?Checked Out To:[\sA-Z,]+\b" my_var = Range("$A$3").Text Set re = CreateObject("vbscript.regexp") re.Pattern = sPat re.Global = True re.ignorecase = False If re.test(my_var) = True Then Set mc = re.Execute(my_var) For Each m In mc Debug.Print m Next m End If End Sub ============================= |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Fri, 20 Aug 2010 13:32:29 -0700 (PDT), ryguy7272
wrote: On Aug 20, 4:20*pm, Ron Rosenfeld wrote: On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272 wrote: Now, my_var is a HUGE string. *It is supposed to have everything from the URL. *If I go to View Source on my web browser, well, that’s the ‘my_var’ Post from a few lines before, through a few lines after, the segment that you wish to extract. This is pretty much what I’m searching for: Lines Befo <tr class="ms-alternating" <td class="ms-vb-icon" <a tabIndex="-1" onclick="return DispEx(this,event,'TRUE','FALSE','TRUE','SharePoi nt.OpenDocuments. 3','0','SharePoint.OpenDocuments','','','11','11' ,'0','1','0x400001f07fff1bff')" href="/sites/P/SharedDoc/8834310G10X09999.xls" <img title="8834310G10X09999.xls Checked Out To: COLE, TIMMY" alt="8834310G10X09999.xls Checked Out To: COLE, TIMMY" src="/_layouts/images/icxls.gif" border="0" complete="complete"/ Line After: <td height="100%" class="ms-vb-title" I want to extract this, and pop it into a MsgBox: 8834310G10X09999.xls Checked Out To: COLE, TIMMY There’s always a hard return after the .xls One HUGE string should be in he my_var = IE.document.body.innerHTML Then, I want to look for the name of the Excel file and the name of the person, in the 'my_var' Does this make sense? In your example, you have two (2) instances of Checked Out To preceded by an excel file name that ends with a hard return. The following code will return both of them: ==================================== Option Explicit Sub GetCheckedOut() Dim re As Object, mc As Object, m As Object Dim my_var As String, s As String Const sPat As String = _ "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" my_var = Range("$A$3").Text Set re = CreateObject("vbscript.regexp") re.Pattern = sPat re.Global = True re.ignorecase = False If re.test(my_var) = True Then Set mc = re.Execute(my_var) For Each m In mc s = s & vbLf & vbLf & m Next m s = Mid(s, 3) MsgBox (s) End If End Sub ==================================== |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Aug 20, 4:50*pm, Ron Rosenfeld wrote:
On Fri, 20 Aug 2010 13:32:29 -0700 (PDT), ryguy7272 wrote: On Aug 20, 4:20*pm, Ron Rosenfeld wrote: On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272 wrote: Now, my_var is a HUGE string. *It is supposed to have everything from the URL. *If I go to View Source on my web browser, well, that’s the ‘my_var’ Post from a few lines before, through a few lines after, the segment that you wish to extract. This is pretty much what I’m searching for: Lines Befo <tr class="ms-alternating" *<td class="ms-vb-icon" * *<a tabIndex="-1" onclick="return DispEx(this,event,'TRUE','FALSE','TRUE','SharePoi nt.OpenDocuments. 3','0','SharePoint.OpenDocuments','','','11','11' ,'0','1','0x400001f07fff1*bff')" href="/sites/P/SharedDoc/8834310G10X09999.xls" * *<img title="8834310G10X09999.xls Checked Out To: COLE, TIMMY" alt="8834310G10X09999.xls Checked Out To: COLE, TIMMY" src="/_layouts/images/icxls.gif" border="0" complete="complete"/ Line After: <td height="100%" class="ms-vb-title" I want to extract this, and pop it into a MsgBox: 8834310G10X09999.xls Checked Out To: COLE, TIMMY There’s always a hard return after the .xls One HUGE string should be in he my_var = IE.document.body.innerHTML Then, I want to look for the name of the Excel file and the name of the person, in the 'my_var' Does this make sense? In your example, you have two (2) instances of Checked Out To preceded by an excel file name that ends with a hard return. The following code will return both of them: ==================================== Option Explicit Sub GetCheckedOut() * Dim re As Object, mc As Object, m As Object * Dim my_var As String, s As String * Const sPat As String = _ "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" my_var = Range("$A$3").Text Set re = CreateObject("vbscript.regexp") * * re.Pattern = sPat * * re.Global = True * * re.ignorecase = False If re.test(my_var) = True Then * * Set mc = re.Execute(my_var) * * * * For Each m In mc * * * * * * s = s & vbLf & vbLf & m * * * * Next m * * * * s = Mid(s, 3) * * MsgBox (s) End If End Sub ====================================- Hide quoted text - - Show quoted text - Ok, this may be it, or VERY close. However, my_var is NOT Range("$A $3").Text my_var = "" my_var = IE.document.body.innerHTML I set it to a blank first because I saw some weird stuff in there one time. Question: What does this do? If re.test(my_var) = True Then As I step through the code, that condition doesn't seem to go to True, so the code skips to . . . End If . . . and treminates. The Loop won't work; tried it already and there is sooooooooo much stuff in that 'my_var' string and the performace is super-slow. I'm trying something like this: If InStr(1, my_var, activeWB & vbClrf & "Checked Out To: " & UCase(username), vbTextCompare) Then .. . . . . This may do it too: If InStr(1, my_var, activeWB & vbClrf & "Checked Out To: " & activeWB, vbTextCompare) Then Active Workbook is assigned liek this: activeWB = strFullString & ".xls" Can I use this concept. I already know the loop will be too slow . . . . . Thanks for your time and consideration with this, Ron!! |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Fri, 20 Aug 2010 14:12:19 -0700 (PDT), ryguy7272
wrote: Ok, this may be it, or VERY close. However, my_var is NOT Range("$A $3").Text That's irrelevant. You can set up my_var by equating it to anything that has a string. Since I don't have access to your IE.document.body.innerHTML, I put the text you quoted into A3 and set up my_var that way. my_var = "" my_var = IE.document.body.innerHTML I set it to a blank first because I saw some weird stuff in there one time. When you set it to ...innerHTML, it will overwrite whatever was there before. Question: What does this do? If re.test(my_var) = True Then It tests to see if the pattern you described previously exists in my_var As I step through the code, that condition doesn't seem to go to True, That means that the pattern you described -- the before/after lines and the text you want to extract, that you pasted into your previous post -- does not exist in your instance of my_var (or you've somehow not entered sPat properly). so the code skips to . . . End If . . . and treminates. I can't guess what is or is not in your setup of my_var. Since I don't have access to the source, there's no way for me to determine how what you've posted differs from what you have in my_var. If you want, copy and paste the entire my_var (after you have set it equal to ...innerHTML) as plain text into an email, and send to to me at (you'll have to decipher this): MyFirstName AT MyLastNameonline DOT com |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
In message
s.com of Fri, 20 Aug 2010 14:12:19 in microsoft.public.excel.programmin g, ryguy7272 writes On Aug 20, 4:50*pm, Ron Rosenfeld wrote: On Fri, 20 Aug 2010 13:32:29 -0700 (PDT), ryguy7272 [snip] * Const sPat As String = _ "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" my_var = Range("$A$3").Text Set re = CreateObject("vbscript.regexp") * * re.Pattern = sPat * * re.Global = True * * re.ignorecase = False If re.test(my_var) = True Then [snip] Question: What does this do? If re.test(my_var) = True Then As I step through the code, that condition doesn't seem to go to True, so the code skips to . . . End If . . . and treminates. [snip] Thanks for your time and consideration with this, Ron!! As Ryan (Is it Ryan?) did not understand re.test(my_var), I suspect "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" is also opaque to him/her. cf. <http://en.wikipedia.org/wiki/Regular_expression for a description of Regular Expressions(RE). They are HARD, but POWERFUL. <http://msdn.microsoft.com/en-us/library/ms974570.aspx describes the Microsoft dialect. I use them a lot and they give me much grief until I get them right. They then work well until I find an example which breaks them. One technique in my standard arsenal is the following function: Private Function GetSub(ByVal from As String, ByVal Match As String, ByVal Part As String) As String RE.Pattern = Match GetSub = RE.Replace(from, Part) End Function which I use it from the Immediate window (opened with Ctrl+G) as ?GetSub(my_var, "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b", "yes") If the RE matches, this will write yes It may write "fooyesbar". If the RE does not match, it will write nothing or the value of my_var - I can't remember which. A different RE is needed. I would try a complete match of my_var, with codes to skip to the match and from the match. i.e. with ?GetSub(my_var, "^.*\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b.*$", _ "yes") That won't match. Find something that will. e.g. ?GetSub(my_var, "^.*Checked.*$", "yes") To see what matches, use ?GetSub(my_var, "^.*(Checked).*$", "$1") Do a binary chop between small things which match and large things which don't until the desired match is found. Then apply that pattern to Ron's code. I will ignore "What does this do?" when the answer is RTFM <http://en.wi kipedia.org/wiki/Rtfm I wish Ryan? luck. -- Walter Briscoe |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Sat, 21 Aug 2010 10:49:15 +0100, Walter Briscoe
wrote: As Ryan (Is it Ryan?) did not understand re.test(my_var), I suspect "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" is also opaque to him/her. Given this comment from him: my_var = "" my_var = IE.document.body.innerHTML I set it to a blank first because I saw some weird stuff in there one time I suspect there are other areas about VB programming that Ryan needs to study. I use them a lot and they give me much grief until I get them right. They then work well until I find an example which breaks them. If you aren't aware of it, I would suggest a program called RegexBuddy (www.regexbuddy.com). It is very helpful in trying to devise regular expressions that work. In this case, I merely pasted the data supplied into its Test window, and devised a regex that works on it. Since it did not work on the OP's test, either the data or his implementation must be at variance with what I have here. Hopefully we can track down the differences and correct them. RegexBuddy can also generate program code, as well as give explanations of the regex such as: =================================== "Checked Out To" Regex \b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b Assert position at a word boundary «\b» Match a single character that is a “word character” (letters, digits, and underscores) «\w+» Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» Match the character “.” literally «\.» Match the characters “xls” literally «xls» Match a single character present in the list below «[\r\n]» A carriage return character «\r» A line feed character «\n» Match the characters “Checked Out To:” literally «Checked Out To:» Match a single character present in the list below «[\sA-Z,]+» Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+» A whitespace character (spaces, tabs, and line breaks) «\s» A character in the range between “A” and “Z” «A-Z» The character “,” «,» Assert position at a word boundary «\b» Created with RegexBuddy ================================================== == |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
In message of Sat, 21 Aug
2010 06:47:29 in microsoft.public.excel.programming, Ron Rosenfeld writes On Sat, 21 Aug 2010 10:49:15 +0100, Walter Briscoe wrote: As Ryan (Is it Ryan?) did not understand re.test(my_var), I suspect "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" is also opaque to him/her. Given this comment from him: my_var = "" my_var = IE.document.body.innerHTML I set it to a blank first because I saw some weird stuff in there one time I suspect there are other areas about VB programming that Ryan needs to study. I use them a lot and they give me much grief until I get them right. They then work well until I find an example which breaks them. If you aren't aware of it, I would suggest a program called RegexBuddy (www.regexbuddy.com). It is very helpful in trying to devise regular expressions that work. [snip] Thanks for that Ron. I don't think I had heard of RegexBuddy. Looks good. RegexMagic <http://www.regexbuddy.com/regexmagic.html looks even better for generating REs. Ryan might buy you a copy. ;) ~30USD. I am very impressed they seem to support a wide variety of RE dialect. e.g. both the primitive REs of sed and the sophisticated ones of perl. If I was still being paid to develop software, I would buy both. Meanwhile, I have noted the reference. -- Walter Briscoe |
#14
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Sat, 21 Aug 2010 12:55:55 +0100, Walter Briscoe
wrote: Thanks for that Ron. I don't think I had heard of RegexBuddy. Looks good. RegexMagic <http://www.regexbuddy.com/regexmagic.html looks even better for generating REs. Ryan might buy you a copy. ;) ~30USD. I am very impressed they seem to support a wide variety of RE dialect. e.g. both the primitive REs of sed and the sophisticated ones of perl. If I was still being paid to develop software, I would buy both. Meanwhile, I have noted the reference. -- Walter Briscoe I'm not so sure about Regex Magic. I've been trying it out a bit, but I don't know it well enough to do much more than generate regex's that will accept only a defined range of numbers. Probably if I became more familiar with it, it would be more useful to me. |
#15
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Fri, 20 Aug 2010 14:12:19 -0700 (PDT), ryguy7272
wrote: On Aug 20, 4:50*pm, Ron Rosenfeld wrote: On Fri, 20 Aug 2010 13:32:29 -0700 (PDT), ryguy7272 wrote: On Aug 20, 4:20*pm, Ron Rosenfeld wrote: On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272 wrote: Now, my_var is a HUGE string. *It is supposed to have everything from the URL. *If I go to View Source on my web browser, well, that’s the ‘my_var’ Post from a few lines before, through a few lines after, the segment that you wish to extract. This is pretty much what I’m searching for: Lines Befo <tr class="ms-alternating" *<td class="ms-vb-icon" * *<a tabIndex="-1" onclick="return DispEx(this,event,'TRUE','FALSE','TRUE','SharePoi nt.OpenDocuments. 3','0','SharePoint.OpenDocuments','','','11','11' ,'0','1','0x400001f07fff1*bff')" href="/sites/P/SharedDoc/8834310G10X09999.xls" * *<img title="8834310G10X09999.xls Checked Out To: COLE, TIMMY" alt="8834310G10X09999.xls Checked Out To: COLE, TIMMY" src="/_layouts/images/icxls.gif" border="0" complete="complete"/ Line After: <td height="100%" class="ms-vb-title" I want to extract this, and pop it into a MsgBox: 8834310G10X09999.xls Checked Out To: COLE, TIMMY There’s always a hard return after the .xls One HUGE string should be in he my_var = IE.document.body.innerHTML Then, I want to look for the name of the Excel file and the name of the person, in the 'my_var' Does this make sense? In your example, you have two (2) instances of Checked Out To preceded by an excel file name that ends with a hard return. The following code will return both of them: ==================================== Option Explicit Sub GetCheckedOut() * Dim re As Object, mc As Object, m As Object * Dim my_var As String, s As String * Const sPat As String = _ "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b" my_var = Range("$A$3").Text Set re = CreateObject("vbscript.regexp") * * re.Pattern = sPat * * re.Global = True * * re.ignorecase = False If re.test(my_var) = True Then * * Set mc = re.Execute(my_var) * * * * For Each m In mc * * * * * * s = s & vbLf & vbLf & m * * * * Next m * * * * s = Mid(s, 3) * * MsgBox (s) End If End Sub ====================================- Hide quoted text - - Show quoted text - Ok, this may be it, or VERY close. However, my_var is NOT Range("$A $3").Text my_var = "" my_var = IE.document.body.innerHTML I set it to a blank first because I saw some weird stuff in there one time. Question: What does this do? If re.test(my_var) = True Then As I step through the code, that condition doesn't seem to go to True, so the code skips to . . . End If . . . and treminates. The Loop won't work; tried it already and there is sooooooooo much stuff in that 'my_var' string and the performace is super-slow. I'm trying something like this: If InStr(1, my_var, activeWB & vbClrf & "Checked Out To: " & UCase(username), vbTextCompare) Then . . . . . This may do it too: If InStr(1, my_var, activeWB & vbClrf & "Checked Out To: " & activeWB, vbTextCompare) Then Active Workbook is assigned liek this: activeWB = strFullString & ".xls" Can I use this concept. I already know the loop will be too slow . . . . . Thanks for your time and consideration with this, Ron!! Did you get your macro working? I have not received anything from you. |
#16
Posted to microsoft.public.excel.programming
|
|||
|
|||
Find a Pattern and Parse Text
On Aug 21, 5:27*am, Ron Rosenfeld wrote:
On Sat, 21 Aug 2010 12:55:55 +0100, Walter Briscoe wrote: Thanks for that Ron. I don't think I had heard of RegexBuddy. Looks good. RegexMagic <http://www.regexbuddy.com/regexmagic.html looks even better for generating REs. Ryan might buy you a copy. ;) ~30USD. I am very impressed they seem to support a wide variety of RE dialect. e.g. both the primitive REs of sed and the sophisticated ones of perl. If I was still being paid to develop software, I would buy both. Meanwhile, I have noted the reference. -- Walter Briscoe I'm not so sure about Regex Magic. *I've been trying it out a bit, but I don't know it well enough to do much more than generate regex's that will accept only a defined range of numbers. *Probably if I became more familiar with it, it would be more useful to me. Hello everyone! Thanks for all the help with this. I think Regular expressions may be the key to solving this puzzle. I've got a bit more work to do on this. I'll try to get my arms around it by mid week and post back with a solution as soon as possible. Thanks again! Ryan-- |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Where to find pattern fill add-in | Charts and Charting in Excel | |||
Where to find pattern fill add-in | Excel Discussion (Misc queries) | |||
How to find & delete a particular pattern of text from values of a | Excel Discussion (Misc queries) | |||
VBE Find and replace with pattern | Excel Programming | |||
Wish to parse through a text string to find data | Excel Programming |