Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Junior Member
 
Posts: 11
Default Find date in excel string previous to last occurence of specific text

I have been combing through multiple forums and excel formula code but cannot find anything to help me out.

I have thousands of strings I need to parse through to extract out a specific date. The date location has no specific parameters around it to easily extract it out. The location from the beginning or the end varies with every string. The only thing that is consistent is a message that reads "Changed Status to Subtasks Created" which can be displayed anywhere from 20 characters to 660 characters after the date we need to extract.

Here is a snippet:
Changed Item Details to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_1 to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_2 to
Changed Details_3 to
Added the following Subtasks to this Request: 146448.
Escalated: Rename Master Ticket - NH
Deleted Assignee: UAM
Changed Status to Subtasks Created from Generating Subtasks
Changed Request Title
Changed Initial Approval to None
Changed Additional Approval to No
Changed Ticket Item to
02/15/2013 12:14 PM Escalated Changed Status to Closed from Subtasks Created

So I need to extract then 02/15/2013 9:49 AM text out of this string. The above snippet has 4200 characters in front of what I showed above. Can anyone help?
  #2   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 3,872
Default Find date in excel string previous to last occurence of specific text

Hi,

Am Tue, 23 Jul 2013 20:28:41 +0100 schrieb osuejm1026:

Changed Item Details to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_1 to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_2 to
Changed Details_3 to
Added the following Subtasks to this Request: 146448.
Escalated: Rename Master Ticket - NH
Deleted Assignee: UAM
Changed Status to Subtasks Created from Generating Subtasks
Changed Request Title
Changed Initial Approval to None
Changed Additional Approval to No
Changed Ticket Item to
02/15/2013 12:14 PM Escalated Changed Status to Closed from Subtasks
Created


try the function:

Function myDate(myCell As Range) As Date
Dim i As Integer
Dim j As Integer
Dim k As Integer

i = InStr(1, myCell, "Changed Status to Subtasks Created") - 5
j = InStrRev(myCell, "AM", i)
k = InStrRev(myCell, "PM", i)
myDate = Format(Mid(myCell, WorksheetFunction.Max(j, k) _
- 17, 19), "mm/dd/yyyy h:mm AM/PM")
End Function

Your string in A1 then into the sheet:
=myDate(A1)


Regards
Claus B.
--
Win XP PRof SP2 / Vista Ultimate SP2
Office 2003 SP2 /2007 Ultimate SP2
  #3   Report Post  
Junior Member
 
Posts: 11
Default

I enter in your function and when i type in the function in excel, i get a #VALUE! returned. I stepped through the code and it seems to be working, but nothing gets returned. Thoughts?

Quote:
Originally Posted by Claus Busch View Post
Hi,

Am Tue, 23 Jul 2013 20:28:41 +0100 schrieb osuejm1026:

Changed Item Details to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_1 to CTS Details:
02/15/2013 9:49 AM Escalated Changed Details_2 to
Changed Details_3 to
Added the following Subtasks to this Request: 146448.
Escalated: Rename Master Ticket - NH
Deleted Assignee: UAM
Changed Status to Subtasks Created from Generating Subtasks
Changed Request Title
Changed Initial Approval to None
Changed Additional Approval to No
Changed Ticket Item to
02/15/2013 12:14 PM Escalated Changed Status to Closed from Subtasks
Created


try the function:

Function myDate(myCell As Range) As Date
Dim i As Integer
Dim j As Integer
Dim k As Integer

i = InStr(1, myCell, "Changed Status to Subtasks Created") - 5
j = InStrRev(myCell, "AM", i)
k = InStrRev(myCell, "PM", i)
myDate = Format(Mid(myCell, WorksheetFunction.Max(j, k) _
- 17, 19), "mm/dd/yyyy h:mm AM/PM")
End Function

Your string in A1 then into the sheet:
=myDate(A1)


Regards
Claus B.
--
Win XP PRof SP2 / Vista Ultimate SP2
Office 2003 SP2 /2007 Ultimate SP2
  #4   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Tue, 23 Jul 2013 20:28:41 +0100, osuejm1026 wrote:

I have thousands of strings I need to parse through to extract out a
specific date. The date location has no specific parameters around it
to easily extract it out. The location from the beginning or the end
varies with every string. The only thing that is consistent is a
message that reads "Changed Status to Subtasks Created" which can be
displayed anywhere from 20 characters to 660 characters after the date
we need to extract.


Based on your snippet, there could be more than one date preceding the particular message. In your snippet, there are two that happen to be identical. Do you want to return the date closest to the "message", or the first date in the string that meets the "distance (20-660)" requirement?

Also, is your entire string in a single cell? Or are they in multiple cells? If the latter, how are they split up?

If the entire string is in a single cell, then this User Defined Function will extract the date. If there are multiple dates (as in your snippet), I have included lines to extract either the first date encountered, or the date closest to the message. Just comment out the one you don't want.

To enter this User Defined Function (UDF), <alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=GetDate(A1) in some cell. Change A1 to reflect the location of the string.

in some cell.

==============================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)

'Comment out or delete what you don't want he
'to return 1st date-time followed by string
GetDate = mc(0)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
=================================
  #5   Report Post  
Junior Member
 
Posts: 11
Default

thanks for the code. Yes there are multiple dates preceding the message, its essentially an audit trail. So there is exactly one date prior to the message that i need. So I need the closest one to my message not after the message but before the message occurs.

The data is all in a single cell (essentially a database dump for the record into one cell). So your UDT looks cool but I actually need the date preceding not following.



Quote:
Originally Posted by Ron Rosenfeld[_2_] View Post
On Tue, 23 Jul 2013 20:28:41 +0100, osuejm1026 wrote:

I have thousands of strings I need to parse through to extract out a
specific date. The date location has no specific parameters around it
to easily extract it out. The location from the beginning or the end
varies with every string. The only thing that is consistent is a
message that reads "Changed Status to Subtasks Created" which can be
displayed anywhere from 20 characters to 660 characters after the date
we need to extract.


Based on your snippet, there could be more than one date preceding the particular message. In your snippet, there are two that happen to be identical. Do you want to return the date closest to the "message", or the first date in the string that meets the "distance (20-660)" requirement?

Also, is your entire string in a single cell? Or are they in multiple cells? If the latter, how are they split up?

If the entire string is in a single cell, then this User Defined Function will extract the date. If there are multiple dates (as in your snippet), I have included lines to extract either the first date encountered, or the date closest to the message. Just comment out the one you don't want.

To enter this User Defined Function (UDF), <alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=GetDate(A1) in some cell. Change A1 to reflect the location of the string.

in some cell.

==============================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)

'Comment out or delete what you don't want he
'to return 1st date-time followed by string
GetDate = mc(0)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
=================================


  #6   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 01:27:58 +0100, osuejm1026 wrote:

thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs.

The data is all in a single cell (essentially a database dump for the
record into one cell). So your UDT looks cool but I actually need the
date preceding not following.


I think I'm not understanding you correctly.

How does the date preceding a string, differ from the date that is followed by the string? (where string=message)

date ..... string Note that the date precedes the string

date .... string Note that the date is followed by the string

I don't see a difference in the sequence.

But tell me what went wrong when you tried my UDF on one of your samples by showing me the relevant data input, what you expected to be returned, and what was actually returned by the UDF.
As written, it should have returned the date closest to the "message" that occurred before the message.
  #7   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 01:27:58 +0100, osuejm1026 wrote:

thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs.


Although you haven't reported back as to what the problem was with my UDF, I'm wondering if there might be more than one date to extract from a single cell.
In other words, can there be, in a single cell a sequence like:

Date1
Date2
....
....
....
Message
....
....
Date3
....
....
Message

Where "Message" = "Changed Status to Subtasks Created"
and, in the above example, you would want to extract both Date2 and Date3?

It is doable, but with a more complicated Regular Expression (regex).
As written, the regex looks for a date that is followed by Message in the range of 20 to 660 characters. It returns all the dates but we only look at the last one returned.
However, if there might be more than one date that needs to be returned, we need to change the regex logic, and also the UDF output logic so it returns an array of dates.
It's easily doable, but I would not add the complexity unless it is required.
  #8   Report Post  
Junior Member
 
Posts: 11
Default

I have entered your UDF and when I type it into a cell I see GetDate appear. I enter in the appropriate cell but all i get returned is #NAME?. I put a watch stamp on the code and the code is not even triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right now. I have other logic to pull in other dates that i require that is much easier to extract.

So I am not sure what is going on, but I made sure macro security is disabled so all macros work, and i have closed and reopened, but still no dice.

Thoughts?



Quote:
Originally Posted by Ron Rosenfeld[_2_] View Post
On Wed, 24 Jul 2013 01:27:58 +0100, osuejm1026 wrote:

thanks for the code. Yes there are multiple dates preceding the
message, its essentially an audit trail. So there is exactly one date
prior to the message that i need. So I need the closest one to my
message not after the message but before the message occurs.


Although you haven't reported back as to what the problem was with my UDF, I'm wondering if there might be more than one date to extract from a single cell.
In other words, can there be, in a single cell a sequence like:

Date1
Date2
....
....
....
Message
....
....
Date3
....
....
Message

Where "Message" = "Changed Status to Subtasks Created"
and, in the above example, you would want to extract both Date2 and Date3?

It is doable, but with a more complicated Regular Expression (regex).
As written, the regex looks for a date that is followed by Message in the range of 20 to 660 characters. It returns all the dates but we only look at the last one returned.
However, if there might be more than one date that needs to be returned, we need to change the regex logic, and also the UDF output logic so it returns an array of dates.
It's easily doable, but I would not add the complexity unless it is required.
  #9   Report Post  
Junior Member
 
Posts: 11
Default

Nevermind, figured out the error. However, for some reason every 100th row or so displays a #VALUE! So after a couple thousands records its only a handful that I could look up manually but just curious if you knew why it might throw that. But your code is working wonders of the ones I have checked so far.

Quote:
Originally Posted by osuejm1026 View Post
I have entered your UDF and when I type it into a cell I see GetDate appear. I enter in the appropriate cell but all i get returned is #NAME?. I put a watch stamp on the code and the code is not even triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right now. I have other logic to pull in other dates that i require that is much easier to extract.

So I am not sure what is going on, but I made sure macro security is disabled so all macros work, and i have closed and reopened, but still no dice.

Thoughts?
  #10   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 14:36:16 +0100, osuejm1026 wrote:


I have entered your UDF and when I type it into a cell I see GetDate
appear. I enter in the appropriate cell but all i get returned is
#NAME?. I put a watch stamp on the code and the code is not even
triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.

So I am not sure what is going on, but I made sure macro security is
disabled so all macros work, and i have closed and reopened, but still
no dice.

Thoughts?


Did you enter the UDF code into a regular module using the "exact" method I laid out?
Is the UDF in an open workbook?

The #NAME? error is usually because your workbook is not recognizing the macro. But when you write "I see GetDate appear", it seems it should be -- if what you mean by that is that
when you start typing =get a dropdown appears and one of the items on the dropdown is GetDate.

If all else fails, post a copy of your workbook (sanitized as necessary) with the UDF included, on some public sharing website (e.g. SkyDrive) and post a link here.





  #11   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 14:36:16 +0100, osuejm1026 wrote:

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.


My question wasn't so much as to whether to extract just one date, but rather if there could be more than one

<Date Time
....
....
<Message

grouping in a single cell.

Also, although the regular expression looks complicated, it is fairly simple and could be adapted to other dates.

(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}\s+(?:0?[1-9]|1[12]):[0-5][0-9]\s+[AP]M

is merely a method for recognizing VALID date strings in the format of dd/mm/yyyy (h)h:mm AM or PM. As written it will recognize dates from 1/1/1900 - 12/31/2000 but can be extended if necessary.

The rest of the regex: (?=[\s\S+]{20,660}Changed Status to Subtasks Created)
says "look ahead beetween 20 - 660 characters until you get that message "Changed Status to Subtasks Created"

So if similar logic applies to your other dates, it would be trivial to change the regex to accomodate.
  #12   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 15:14:21 +0100, osuejm1026 wrote:

Nevermind, figured out the error. However, for some reason every 100th
row or so displays a #VALUE! So after a couple thousands records its
only a handful that I could look up manually but just curious if you
knew why it might throw that. But your code is working wonders of the
ones I have checked so far.


A #VALUE! error most likely means that the sequence being tested for

Date (in the format specified)
....
....
<message

does not exist in that cell.

If you could post an exact copy of a cell that returns #VALUE!, I should be able to locate the issue.
  #13   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 3,872
Default Find date in excel string previous to last occurence of specific text

Hi,

Am Tue, 23 Jul 2013 22:52:16 +0100 schrieb osuejm1026:

I enter in your function and when i type in the function in excel, i get
a #VALUE! returned. I stepped through the code and it seems to be
working, but nothing gets returned. Thoughts?


have a look:
https://skydrive.live.com/#cid=9378A...121822A3%21326
for the workbook "Date"
So in SkyDrive macros are disabled you have to rightclick and download
the workbook.


Regards
Claus B.
--
Win XP PRof SP2 / Vista Ultimate SP2
Office 2003 SP2 /2007 Ultimate SP2
  #14   Report Post  
Junior Member
 
Posts: 11
Default

Quote:
Originally Posted by Ron Rosenfeld[_2_] View Post
On Wed, 24 Jul 2013 14:36:16 +0100, osuejm1026 wrote:


I have entered your UDF and when I type it into a cell I see GetDate
appear. I enter in the appropriate cell but all i get returned is
#NAME?. I put a watch stamp on the code and the code is not even
triggered like it is not even catching.

To answer your later post, I only need to return the 1 date for right
now. I have other logic to pull in other dates that i require that is
much easier to extract.

So I am not sure what is going on, but I made sure macro security is
disabled so all macros work, and i have closed and reopened, but still
no dice.

Thoughts?


Did you enter the UDF code into a regular module using the "exact" method I laid out?
Is the UDF in an open workbook?

The #NAME? error is usually because your workbook is not recognizing the macro. But when you write "I see GetDate appear", it seems it should be -- if what you mean by that is that
when you start typing =get a dropdown appears and one of the items on the dropdown is GetDate.

If all else fails, post a copy of your workbook (sanitized as necessary) with the UDF included, on some public sharing website (e.g. SkyDrive) and post a link here.
Not sure what I did, but all of a sudden it just started working.

Ok so for 5,000 lines the code worked except for 167 of them. Now some are legit as the audit trail never captured the specific string i was looking for. But for 145 of them, the string exists, just returned a #VALUE! instead. Not sure why, looking to see if there is any extra spaces or anything different, I am just not seeing it.

Here is a snippet of an audit trail that failed with your code:
06/01/2013 10:32 AM Escalated Changed UAM Email to Employee Information:
06/01/2013 10:32 AM Escalated Changed AutoClose to No
Added the following Subtasks to this Request: 765640.
06/01/2013 10:45 AM Escalated Escalated: Rename Master Ticket - CU-IU/SLM
Changed Status to Subtasks Created from Checking Approvals
Changed Request Title
Changed Employee Last Name: to Srew
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to
Changed ProfileID to 765634
06/06/2013 10:38 AM Escalated Changed Status to Closed from Subtasks Created
Master automatically set to Closed as a result of change in subtask .


Here is a snippet of one where the code did work:
05/01/2013 10:54 AM Escalated Escalated: Awaiting Subtask Processing
Changed Status to Generating Subtasks from Request Verified
Escalated: Security Groups Subtask
Changed Status to Awaiting Incident Creation from Generating Subtasks
Changed Request Title
Changed Employee Last Name: to
Changed Security Groups to Currently Active
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to Remove
Changed ProfileID to
Changed Ticket Item to Security Groups
Changed Item Details to Security Group Detail:
05/01/2013 10:54 AM Escalated Changed UAM Email to Employee Information:
05/01/2013 10:54 AM Escalated Added the following Subtasks to this Request: 761510.
05/01/2013 11:05 AM Escalated Escalated: Rename Master Ticket - CU / DU
Changed Status to Subtasks Created from Awaiting Incident Creation
Changed Request Title
Changed Employee Last Name: to colsky cotolsky
Changed Request data or Contact data
Changed Initial Approval to None
Changed Additional Approval to No
Changed Action Needed to
Changed ProfileID to 761458
05/01/2013 1:38 PM Escalated Changed Status to Closed from Subtasks Created
Master automatically set to Closed as a result of change in subtask .
  #15   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 1,045
Default Find date in excel string previous to last occurence of specific text

On Wed, 24 Jul 2013 16:03:51 +0100, osuejm1026 wrote:

Not sure what I did, but all of a sudden it just started working.

Ok so for 5,000 lines the code worked except for 167 of them. Now some
are legit as the audit trail never captured the specific string i was
looking for. But for 145 of them, the string exists, just returned a
#VALUE! instead. Not sure why, looking to see if there is any extra
spaces or anything different, I am just not seeing it.


Got it. There was an error in the regex in that it would not detect times from 10:00 to 10:59
Easy fix:

==========================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[012]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
================================================== ======


  #16   Report Post  
Junior Member
 
Posts: 11
Default

Quote:
Originally Posted by Ron Rosenfeld[_2_] View Post
On Wed, 24 Jul 2013 16:03:51 +0100, osuejm1026 wrote:

Not sure what I did, but all of a sudden it just started working.

Ok so for 5,000 lines the code worked except for 167 of them. Now some
are legit as the audit trail never captured the specific string i was
looking for. But for 145 of them, the string exists, just returned a
#VALUE! instead. Not sure why, looking to see if there is any extra
spaces or anything different, I am just not seeing it.


Got it. There was an error in the regex in that it would not detect times from 10:00 to 10:59
Easy fix:

==========================
Option Explicit
Function GetDate(s As String) As Date
Dim re As Object, mc As Object, m As Object
Const sPat As String = "(?:0[1-9]|1[012])[- /.](?:0[1-9]|[12][0-9]|3[01])[- /.](?:19|20)[0-9]{2}" & _
"\s+(?:0?[1-9]|1[012]):[0-5][0-9]\s+[AP]M(?=[\s\S+]{20,660}Changed Status to Subtasks Created)"
Set re = CreateObject("vbscript.regexp")
With re
.Pattern = sPat
.Global = True
.ignorecase = True
End With

Set mc = re.Execute(s)
'to return last date-time followed by string
GetDate = mc(mc.Count - 1)

End Function
================================================== ======


100% worked. Thanks for helping out. I tried using the same code for another string that works the same way but was not entirely successful but right now have a deadline I have to hit first. I might reply next week with some snippets to see if there is anything that can be altered to make it work.

Thanks again.
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
How do I find last occurence of a character in a text string in Ex jten Excel Programming 4 August 24th 09 04:02 PM
How to find the first occurence of any number in a string Caio Milani Excel Programming 2 November 7th 06 08:02 PM
Find first occurence of specific data in a column. Ben Excel Programming 3 May 24th 05 03:53 PM
Find specific text in a string Samuel[_5_] Excel Programming 4 February 22nd 05 10:27 PM
Find specific text in a string Kilcup Excel Programming 5 June 10th 04 04:29 PM


All times are GMT +1. The time now is 03:50 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"