Posted to microsoft.public.excel.programming
Ron Rosenfeld
Find a Pattern and Parse Text

On Fri, 20 Aug 2010 14:12:19 -0700 (PDT), ryguy7272
wrote:

On Aug 20, 4:50 pm, Ron Rosenfeld wrote:
On Fri, 20 Aug 2010 13:32:29 -0700 (PDT), ryguy7272 wrote:
On Aug 20, 4:20 pm, Ron Rosenfeld wrote:
On Fri, 20 Aug 2010 12:44:26 -0700 (PDT), ryguy7272 wrote:
Now, my_var is a HUGE string. It is supposed to have everything from
the URL. If I go to View Source on my web browser, well, that's the
'my_var'


Post from a few lines before, through a few lines after, the segment
that you wish to extract.


This is pretty much what I’m searching for:


Lines Before:
<tr class="ms-alternating">
 <td class="ms-vb-icon">
   <a tabIndex="-1" onclick="return DispEx(this,event,'TRUE','FALSE','TRUE','SharePoint.OpenDocuments.3','0','SharePoint.OpenDocuments','','','11','11','0','1','0x400001f07fff1bff')"
href="/sites/P/SharedDoc/8834310G10X09999.xls">
   <img title="8834310G10X09999.xls
Checked Out To: COLE, TIMMY" alt="8834310G10X09999.xls
Checked Out To: COLE, TIMMY" src="/_layouts/images/icxls.gif"
border="0" complete="complete"/>


Line After:
<td height="100%" class="ms-vb-title">


I want to extract this, and pop it into a MsgBox:
8834310G10X09999.xls
Checked Out To: COLE, TIMMY


There’s always a hard return after the .xls


One HUGE string should be in here:
my_var = IE.document.body.innerHTML


Then, I want to look for the name of the Excel file and the name of
the person, in the 'my_var'


Does this make sense?


In your example, you have two (2) instances of Checked Out To preceded
by an excel file name that ends with a hard return.

The following code will return both of them:

====================================
Option Explicit
Sub GetCheckedOut()
    Dim re As Object, mc As Object, m As Object
    Dim my_var As String, s As String
    ' File name ending in .xls, one line-break character, then the "Checked Out To:" text
    Const sPat As String = _
        "\b\w+\.xls[\r\n]Checked Out To:[\sA-Z,]+\b"

    my_var = Range("$A$3").Text
    Set re = CreateObject("vbscript.regexp")
    re.Pattern = sPat
    re.Global = True        ' return every match, not just the first
    re.IgnoreCase = False

    If re.Test(my_var) = True Then
        Set mc = re.Execute(my_var)
        For Each m In mc
            s = s & vbLf & vbLf & m
        Next m
        s = Mid(s, 3)       ' drop the two leading line feeds
        MsgBox (s)
    End If
End Sub
====================================



Ok, this may be it, or VERY close. However, my_var is NOT
Range("$A$3").Text
my_var = ""
my_var = IE.document.body.innerHTML

I set it to a blank first because I saw some weird stuff in there one
time.

Question: What does this do?
If re.test(my_var) = True Then

As I step through the code, that condition doesn't seem to go to True,
so the code skips to . . . End If . . . and terminates.
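
For reference, re.Test returns True when the pattern matches anywhere in the string. One possible reason it stays False on the page source (an assumption, not something confirmed in this thread) is that the break after ".xls" is a two-character CR+LF pair, while [\r\n] in the pattern matches exactly one character. Below is a minimal sketch with [\r\n]+ instead. HasCheckedOut, DemoHasCheckedOut and the sample string are hypothetical names and data made up for illustration.

```vba
Option Explicit

' Hypothetical helper: True when sText contains at least one
' "<name>.xls" + line break(s) + "Checked Out To: ..." sequence.
Function HasCheckedOut(ByVal sText As String) As Boolean
    Dim re As Object
    Set re = CreateObject("vbscript.regexp")
    ' [\r\n]+ accepts CR, LF, or CR+LF between the file name and the label
    re.Pattern = "\b\w+\.xls[\r\n]+Checked Out To:[\sA-Z,]+\b"
    re.Global = True
    re.IgnoreCase = False
    HasCheckedOut = re.Test(sText)
End Function

Sub DemoHasCheckedOut()
    Dim sample As String
    ' Assumed sample: a fragment shaped like the title attribute posted above,
    ' with a CR+LF break after the file name
    sample = "title=""8834310G10X09999.xls" & vbCrLf & _
             "Checked Out To: COLE, TIMMY"" alt="
    Debug.Print HasCheckedOut(sample)
End Sub
```

If this still comes back False against IE.document.body.innerHTML, it may help to Debug.Print a short stretch of my_var around the file name to see which characters actually follow ".xls".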

The Loop won't work; tried it already and there is sooooooooo much
stuff in that 'my_var' string and the performance is super-slow.

I'm trying something like this:
If InStr(1, my_var, activeWB & vbCrLf & "Checked Out To: " &
UCase(username), vbTextCompare) Then
. . . . .

This may do it too:
If InStr(1, my_var, activeWB & vbCrLf & "Checked Out To: " & activeWB,
vbTextCompare) Then

Active Workbook is assigned like this:
activeWB = strFullString & ".xls"

Can I use this concept? I already know the loop will be too
slow . . . . .
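
A rough sketch of the InStr idea, for what it is worth. InStr returns the position of the first hit, or 0 when nothing is found, and the VBA constant for a carriage return plus line feed is vbCrLf. Whether the page source uses CR+LF or a bare LF after the file name is an assumption here, so the sketch tries both. IsCheckedOutTo, DemoIsCheckedOutTo and the sample string are hypothetical, made up for illustration.

```vba
Option Explicit

' Hypothetical helper: True when sHtml contains
' "<file>" + line break + "Checked Out To: <user>".
Function IsCheckedOutTo(ByVal sHtml As String, ByVal sFile As String, _
                        ByVal sUser As String) As Boolean
    Dim sTail As String
    sTail = "Checked Out To: " & sUser
    ' Try CR+LF first, then a bare LF; the break style is an assumption
    IsCheckedOutTo = (InStr(1, sHtml, sFile & vbCrLf & sTail, vbTextCompare) > 0) _
                  Or (InStr(1, sHtml, sFile & vbLf & sTail, vbTextCompare) > 0)
End Function

Sub DemoIsCheckedOutTo()
    Dim sample As String
    ' Assumed sample shaped like the alt attribute posted earlier in the thread
    sample = "alt=""8834310G10X09999.xls" & vbCrLf & _
             "Checked Out To: COLE, TIMMY"" src="
    Debug.Print IsCheckedOutTo(sample, "8834310G10X09999.xls", "COLE, TIMMY")
End Sub
```

Because this is a plain substring search for one known file and one known user, it makes a single pass over my_var and needs no loop over matches.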

Thanks for your time and consideration with this, Ron!!



Did you get your macro working? I have not received anything from
you.