Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
I am trying to parse data but with limited success.
Within an imported file I have data like this: 26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200 Ref:200060 29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207 Trans. ID.:00000505 06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619 22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a Trans. ID.:00000524 12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218 where the date is in Col C and all the rest is in Col D For the purposes of this we'll assume that the 5 rows of data are in rows 1 to 5 (if only it were that simple!) I want to split the text in Col D into Cols E to I whe Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns MT.:004367 for the first row in this example and "" for the rest and now it gets complicated.... Col F has a formula which returns the supplier so in the examples above it would bring back CAPITAL CITY COUNCIL in row 1 and then West Country Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing in subsequent rows. So the formula needs to return everything to the left of the first occurence of either Inv or 0000 or O/No unless the string starts with MT. where it returns text from the 12th character in the string until there are multiple spaces Col G has a formula which returns Inv:and the following numbers until there is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would have Inv:7097932959 and row 3 would have Inv:StM002L Col H has a formula which returns Crd. Nt. No.:and the following numbers until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd. Nt. No.:4823a Col I has a formula which returns "Non-Order" if "Non-Order" appears in the string or O/No:and the following numbers until there is a space Col J Returns Trans ID.:and the following digits By using IF and SEARCH I can tell if and where say "Inv:" exists within the text string but I can't see how to use this as the start point or setting the stop point for extracting the following text so I'm really stuck now and would appreciate some guidance. Sorry for such a long post but many thanks in anticipation |
#2
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Try this out, if you always have the :
=RIGHT(A1,LEN(A1)-FIND(":",A1)) -- -John Northwest11 Please rate when your question is answered to help us and others know what is helpful. "nospaminlich" wrote: I am trying to parse data but with limited success. Within an imported file I have data like this: 26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200 Ref:200060 29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207 Trans. ID.:00000505 06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619 22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a Trans. ID.:00000524 12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218 where the date is in Col C and all the rest is in Col D For the purposes of this we'll assume that the 5 rows of data are in rows 1 to 5 (if only it were that simple!) I want to split the text in Col D into Cols E to I whe Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns MT.:004367 for the first row in this example and "" for the rest and now it gets complicated.... Col F has a formula which returns the supplier so in the examples above it would bring back CAPITAL CITY COUNCIL in row 1 and then West Country Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing in subsequent rows. So the formula needs to return everything to the left of the first occurence of either Inv or 0000 or O/No unless the string starts with MT. where it returns text from the 12th character in the string until there are multiple spaces Col G has a formula which returns Inv:and the following numbers until there is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would have Inv:7097932959 and row 3 would have Inv:StM002L Col H has a formula which returns Crd. Nt. No.:and the following numbers until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd. Nt. No.:4823a Col I has a formula which returns "Non-Order" if "Non-Order" appears in the string or O/No:and the following numbers until there is a space Col J Returns Trans ID.:and the following digits By using IF and SEARCH I can tell if and where say "Inv:" exists within the text string but I can't see how to use this as the start point or setting the stop point for extracting the following text so I'm really stuck now and would appreciate some guidance. Sorry for such a long post but many thanks in anticipation |
#3
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Any way you cut it, there ain't no easy answer to this sort of thing because
of all the unknown variables.........but you might try something like this. 1- go ahead and use your original formula to populate column E 2- in column F, maybe something like this, to get rid of the MT thing off the lead of some of the cells and have each start with a name...... =if(left(d2,3="MT.",mid(d2,12,99),d2) 3- and maybe something like this to pick up the end =mid(f2,find(":",f2,1)-4,99) hth Vaya con Dios, Chuck, CABGx3 "nospaminlich" wrote: I am trying to parse data but with limited success. Within an imported file I have data like this: 26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200 Ref:200060 29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207 Trans. ID.:00000505 06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619 22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a Trans. ID.:00000524 12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218 where the date is in Col C and all the rest is in Col D For the purposes of this we'll assume that the 5 rows of data are in rows 1 to 5 (if only it were that simple!) I want to split the text in Col D into Cols E to I whe Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns MT.:004367 for the first row in this example and "" for the rest and now it gets complicated.... Col F has a formula which returns the supplier so in the examples above it would bring back CAPITAL CITY COUNCIL in row 1 and then West Country Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing in subsequent rows. So the formula needs to return everything to the left of the first occurence of either Inv or 0000 or O/No unless the string starts with MT. where it returns text from the 12th character in the string until there are multiple spaces Col G has a formula which returns Inv:and the following numbers until there is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would have Inv:7097932959 and row 3 would have Inv:StM002L Col H has a formula which returns Crd. Nt. No.:and the following numbers until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd. Nt. No.:4823a Col I has a formula which returns "Non-Order" if "Non-Order" appears in the string or O/No:and the following numbers until there is a space Col J Returns Trans ID.:and the following digits By using IF and SEARCH I can tell if and where say "Inv:" exists within the text string but I can't see how to use this as the start point or setting the stop point for extracting the following text so I'm really stuck now and would appreciate some guidance. Sorry for such a long post but many thanks in anticipation |
#4
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
Sorry its early for me, this is better, colon could be in non- invoice lines,
this finds all of inv: =RIGHT(A1,(LEN(A1)-SEARCH("inv:",A1,1))-3) -- -John Northwest11 Please rate when your question is answered to help us and others know what is helpful. "nospaminlich" wrote: I am trying to parse data but with limited success. Within an imported file I have data like this: 26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200 Ref:200060 29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207 Trans. ID.:00000505 06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619 22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a Trans. ID.:00000524 12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218 where the date is in Col C and all the rest is in Col D For the purposes of this we'll assume that the 5 rows of data are in rows 1 to 5 (if only it were that simple!) I want to split the text in Col D into Cols E to I whe Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns MT.:004367 for the first row in this example and "" for the rest and now it gets complicated.... Col F has a formula which returns the supplier so in the examples above it would bring back CAPITAL CITY COUNCIL in row 1 and then West Country Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing in subsequent rows. So the formula needs to return everything to the left of the first occurence of either Inv or 0000 or O/No unless the string starts with MT. where it returns text from the 12th character in the string until there are multiple spaces Col G has a formula which returns Inv:and the following numbers until there is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would have Inv:7097932959 and row 3 would have Inv:StM002L Col H has a formula which returns Crd. Nt. No.:and the following numbers until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd. Nt. No.:4823a Col I has a formula which returns "Non-Order" if "Non-Order" appears in the string or O/No:and the following numbers until there is a space Col J Returns Trans ID.:and the following digits By using IF and SEARCH I can tell if and where say "Inv:" exists within the text string but I can't see how to use this as the start point or setting the stop point for extracting the following text so I'm really stuck now and would appreciate some guidance. Sorry for such a long post but many thanks in anticipation |
#5
![]()
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
![]()
nospaminlich wrote...
.... Within an imported file I have data like this: 26/07/2006 MT.:004367 CAPITAL CITY COUNCIL 225200 Ref:200060 29/04/2006 West Country Waterworks Co Inv:7097932959 O/No:2252005207 Trans. ID.:00000505 06/11/2006 Conway Cleaning Inv:StM002L Non Order Trans. ID.:00000619 22/06/2006 E A Gangle & Partners 00000527 Crd. Nt. No.:4823a Trans. ID.:00000524 12/06/2006 Wessex Purchasing O/No.:2252005217 Trans. ID.:00000218 where the date is in Col C and all the rest is in Col D For the purposes of this we'll assume that the 5 rows of data are in rows 1 to 5 (if only it were that simple!) I want to split the text in Col D into Cols E to I whe Col E has the formula =if(Left(D2,3)="MT.",Left(D2,10),"") so it returns MT.:004367 for the first row in this example and "" for the rest Note: looks like your first record is in row 2 rather than row 1. Regular expressions would be the best tool for this. If you download and install Laurent Longre's MOREFUNC.XLL add-in, freely available at http://xcell05.free.fr/english/ you could get the same result with the formula =REGEX.MID(D2,"MT\.:\S+") and now it gets complicated.... Col F has a formula which returns the supplier so in the examples above it would bring back CAPITAL CITY COUNCIL in row 1 and then West Country Waterworks Co, Conway Cleaning, E A Gangle & Partners and Wessex Purchasing in subsequent rows. So the formula needs to return everything to the left of the first occurence of either Inv or 0000 or O/No unless the string starts with MT. where it returns text from the 12th character in the string until there are multiple spaces =REGEX.MID(D2,"[A-Za-z][^ \t:]*(\s[^ \t:]+)*?(?=\s(\s|[^ \t:]+:| \d{2}).*$)") Col G has a formula which returns Inv:and the following numbers until there is a string of multiple spaces so rows 1,4 and 5 would be "", row 2 would have Inv:7097932959 and row 3 would have Inv:StM002L =REGEX.MID(D2,"Inv:\S+") Col H has a formula which returns Crd. Nt. No.:and the following numbers until there is a space so rows 1,2,3 and 5 would be "", row 4 would have Crd. Nt. No.:4823a =REGEX.MID(D2,"Crd\. Nt\. No\.:\S+") Col I has a formula which returns "Non-Order" if "Non-Order" appears in the string or O/No:and the following numbers until there is a space =IF(COUNTIF(D2,"*Non Order*"),"Non-Order",REGEX.MID(D2,"O/No:\S+")) Col J Returns Trans ID.:and the following digits .... =REGEX.MID(D2,"Trans\. ID\.:\S+") This leaves unspecified what to do with Ref:# and substrings of numerals not preceded by a tag and colon, e.g., 225200 in the first record. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Extracting a word from a text string | Excel Discussion (Misc queries) | |||
Finding a text string w/in a Cell | Excel Discussion (Misc queries) | |||
How do I look up a number within a string of text | Excel Worksheet Functions | |||
can you find specific text in a string ignoring any other text | Excel Discussion (Misc queries) | |||
Extracting Surname from within a text string | Excel Worksheet Functions |