Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 68
Default Extracting Info From Within A Text String

I am trying to parse data but with limited success.

Within an imported file I have data like this:


26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200
Ref:200060


29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207
Trans. ID.:00000505


06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619

22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a
Trans. ID.:00000524


12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218



where the date is in Col C and all the rest is in Col D

For the purposes of this we'll assume that the 5 rows of data are in rows 1
to 5 (if only it were that simple!)

I want to split the text in Col D into Cols E to I whe

Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns
MT.:004367 for the first row in this example and "" for the rest

and now it gets complicated....

Col F has a formula which returns the supplier so in the examples above it
would bring back CAPITAL CITY COUNCIL in row 1 and then West Country
Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing
in subsequent rows. So the formula needs to return everything to the left of
the first occurence of either Inv or 0000 or O/No unless the string starts
with MT. where it returns text from the 12th character in the string until
there are multiple spaces

Col G has a formula which returns Inv:and the following numbers until there
is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would
have Inv:7097932959 and row 3 would have Inv:StM002L

Col H has a formula which returns Crd. Nt. No.:and the following numbers
until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd.
Nt. No.:4823a

Col I has a formula which returns "Non-Order" if "Non-Order" appears in the
string or O/No:and the following numbers until there is a space

Col J Returns Trans ID.:and the following digits

By using IF and SEARCH I can tell if and where say "Inv:" exists within the
text string but I can't see how to use this as the start point or setting the
stop point for extracting the following text so I'm really stuck now and
would appreciate some guidance.

Sorry for such a long post but many thanks in anticipation

  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 772
Default Extracting Info From Within A Text String

Try this out, if you always have the :
=RIGHT(A1,LEN(A1)-FIND(":",A1))
--
-John Northwest11
Please rate when your question is answered to help us and others know what
is helpful.


"nospaminlich" wrote:

I am trying to parse data but with limited success.

Within an imported file I have data like this:


26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200
Ref:200060


29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207
Trans. ID.:00000505


06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619

22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a
Trans. ID.:00000524


12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218



where the date is in Col C and all the rest is in Col D

For the purposes of this we'll assume that the 5 rows of data are in rows 1
to 5 (if only it were that simple!)

I want to split the text in Col D into Cols E to I whe

Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns
MT.:004367 for the first row in this example and "" for the rest

and now it gets complicated....

Col F has a formula which returns the supplier so in the examples above it
would bring back CAPITAL CITY COUNCIL in row 1 and then West Country
Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing
in subsequent rows. So the formula needs to return everything to the left of
the first occurence of either Inv or 0000 or O/No unless the string starts
with MT. where it returns text from the 12th character in the string until
there are multiple spaces

Col G has a formula which returns Inv:and the following numbers until there
is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would
have Inv:7097932959 and row 3 would have Inv:StM002L

Col H has a formula which returns Crd. Nt. No.:and the following numbers
until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd.
Nt. No.:4823a

Col I has a formula which returns "Non-Order" if "Non-Order" appears in the
string or O/No:and the following numbers until there is a space

Col J Returns Trans ID.:and the following digits

By using IF and SEARCH I can tell if and where say "Inv:" exists within the
text string but I can't see how to use this as the start point or setting the
stop point for extracting the following text so I'm really stuck now and
would appreciate some guidance.

Sorry for such a long post but many thanks in anticipation

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
CLR CLR is offline
external usenet poster
 
Posts: 1,998
Default Extracting Info From Within A Text String

Any way you cut it, there ain't no easy answer to this sort of thing because
of all the unknown variables.........but you might try something like this.
1- go ahead and use your original formula to populate column E
2- in column F, maybe something like this, to get rid of the MT thing off
the lead of some of the cells and have each start with a name......
=if(left(d2,3="MT.",mid(d2,12,99),d2)
3- and maybe something like this to pick up the end
=mid(f2,find(":",f2,1)-4,99)

hth
Vaya con Dios,
Chuck, CABGx3





"nospaminlich" wrote:

I am trying to parse data but with limited success.

Within an imported file I have data like this:


26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200
Ref:200060


29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207
Trans. ID.:00000505


06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619

22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a
Trans. ID.:00000524


12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218



where the date is in Col C and all the rest is in Col D

For the purposes of this we'll assume that the 5 rows of data are in rows 1
to 5 (if only it were that simple!)

I want to split the text in Col D into Cols E to I whe

Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns
MT.:004367 for the first row in this example and "" for the rest

and now it gets complicated....

Col F has a formula which returns the supplier so in the examples above it
would bring back CAPITAL CITY COUNCIL in row 1 and then West Country
Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing
in subsequent rows. So the formula needs to return everything to the left of
the first occurence of either Inv or 0000 or O/No unless the string starts
with MT. where it returns text from the 12th character in the string until
there are multiple spaces

Col G has a formula which returns Inv:and the following numbers until there
is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would
have Inv:7097932959 and row 3 would have Inv:StM002L

Col H has a formula which returns Crd. Nt. No.:and the following numbers
until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd.
Nt. No.:4823a

Col I has a formula which returns "Non-Order" if "Non-Order" appears in the
string or O/No:and the following numbers until there is a space

Col J Returns Trans ID.:and the following digits

By using IF and SEARCH I can tell if and where say "Inv:" exists within the
text string but I can't see how to use this as the start point or setting the
stop point for extracting the following text so I'm really stuck now and
would appreciate some guidance.

Sorry for such a long post but many thanks in anticipation

  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 772
Default Extracting Info From Within A Text String

Sorry its early for me, this is better, colon could be in non- invoice lines,
this finds all of inv:
=RIGHT(A1,(LEN(A1)-SEARCH("inv:",A1,1))-3)
--
-John Northwest11
Please rate when your question is answered to help us and others know what
is helpful.


"nospaminlich" wrote:

I am trying to parse data but with limited success.

Within an imported file I have data like this:


26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200
Ref:200060


29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207
Trans. ID.:00000505


06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619

22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a
Trans. ID.:00000524


12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218



where the date is in Col C and all the rest is in Col D

For the purposes of this we'll assume that the 5 rows of data are in rows 1
to 5 (if only it were that simple!)

I want to split the text in Col D into Cols E to I whe

Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns
MT.:004367 for the first row in this example and "" for the rest

and now it gets complicated....

Col F has a formula which returns the supplier so in the examples above it
would bring back CAPITAL CITY COUNCIL in row 1 and then West Country
Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing
in subsequent rows. So the formula needs to return everything to the left of
the first occurence of either Inv or 0000 or O/No unless the string starts
with MT. where it returns text from the 12th character in the string until
there are multiple spaces

Col G has a formula which returns Inv:and the following numbers until there
is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would
have Inv:7097932959 and row 3 would have Inv:StM002L

Col H has a formula which returns Crd. Nt. No.:and the following numbers
until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd.
Nt. No.:4823a

Col I has a formula which returns "Non-Order" if "Non-Order" appears in the
string or O/No:and the following numbers until there is a space

Col J Returns Trans ID.:and the following digits

By using IF and SEARCH I can tell if and where say "Inv:" exists within the
text string but I can't see how to use this as the start point or setting the
stop point for extracting the following text so I'm really stuck now and
would appreciate some guidance.

Sorry for such a long post but many thanks in anticipation

  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 733
Default Extracting Info From Within A Text String

nospaminlich wrote...
....
Within an imported file I have data like this:

26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200
Ref:200060

29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207
Trans. ID.:00000505

06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619

22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a
Trans. ID.:00000524

12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218
where the date is in Col C and all the rest is in Col D

For the purposes of this we'll assume that the 5 rows of data are in rows 1
to 5 (if only it were that simple!)

I want to split the text in Col D into Cols E to I whe

Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns
MT.:004367 for the first row in this example and "" for the rest


Note: looks like your first record is in row 2 rather than row 1.

Regular expressions would be the best tool for this. If you download
and install Laurent Longre's MOREFUNC.XLL add-in, freely available at

http://xcell05.free.fr/english/

you could get the same result with the formula

=REGEX.MID(D2,"MT\.:\S+")

and now it gets complicated....

Col F has a formula which returns the supplier so in the examples above it
would bring back CAPITAL CITY COUNCIL in row 1 and then West Country
Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing
in subsequent rows. So the formula needs to return everything to the left of
the first occurence of either Inv or 0000 or O/No unless the string starts
with MT. where it returns text from the 12th character in the string until
there are multiple spaces


=REGEX.MID(D2,"[A-Za-z][^ \t:]*(\s[^ \t:]+)*?(?=\s(\s|[^ \t:]+:|
\d{2}).*$)")

Col G has a formula which returns Inv:and the following numbers until there
is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would
have Inv:7097932959 and row 3 would have Inv:StM002L


=REGEX.MID(D2,"Inv:\S+")

Col H has a formula which returns Crd. Nt. No.:and the following numbers
until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd.
Nt. No.:4823a


=REGEX.MID(D2,"Crd\. Nt\. No\.:\S+")

Col I has a formula which returns "Non-Order" if "Non-Order" appears in the
string or O/No:and the following numbers until there is a space


=IF(COUNTIF(D2,"*Non Order*"),"Non-Order",REGEX.MID(D2,"O/No:\S+"))

Col J Returns Trans ID.:and the following digits

....

=REGEX.MID(D2,"Trans\. ID\.:\S+")

This leaves unspecified what to do with Ref:# and substrings of
numerals not preceded by a tag and colon, e.g., 225200 in the first
record.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Extracting a word from a text string Nadeem Excel Discussion (Misc queries) 3 October 12th 06 09:17 AM
Finding a text string w/in a Cell ricxl Excel Discussion (Misc queries) 12 March 20th 06 03:47 AM
How do I look up a number within a string of text Rich Hayes Excel Worksheet Functions 3 October 14th 05 05:49 PM
can you find specific text in a string ignoring any other text chriscp Excel Discussion (Misc queries) 1 September 18th 05 09:54 PM
Extracting Surname from within a text string Iainkerr01 Excel Worksheet Functions 6 March 14th 05 10:16 AM


All times are GMT +1. The time now is 05:32 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"