View Single Post
  #10   Report Post  
Posted to microsoft.public.excel.worksheet.functions
frosterrj
 
Posts: n/a
Default String parsing with variable lenght strings

Nope, my data is just one type per cell:
9.5 OZ
30#
#10 TIN
1 GAL
38#
the basic regex.mid works.

Thanks for the pointers to the morefunc.xls. Got it an am using now.

Robert

"Harlan Grove" wrote:

Ron Rosenfeld wrote...
....
Having read that some of your data may optionally include decimal numbers, I
would change my previous recommendation to:

Number:

=REGEX.MID(A1,"[0-9]+(\.[0-9]+)?")

This assumes that all decimal digits have a number before the decimal point.
If that is not the case, it can be easily modified.

Units:

=REGEX.MID(A1,"[^0-9. ]+")

This assumes there are not "dots" in the units, as in your examples. Again, if
there are, the expression can be modified to include them.


It gets more difficult if the numbers could include 0. and .0 and if
the units could include periods and numerals, e.g., 20 reams A4, 5
boxes .22 shells.

With just built-in formulas referencing the original record in A1 plus
the name seq as I defined it earlier,

A3:
=LEFT(A1,LOOKUP(2,1/ISNUMBER(-MID(A1,1,seq)),seq))

B3:
=MID($A1,SUMPRODUCT(LEN($A3:A3))+COLUMNS($A3:B3)/2,
FIND(",",$A1&",",SUMPRODUCT(LEN($A3:A3))+COLUMNS($ A3:B3)/2)
-SUMPRODUCT(LEN($A3:A3))-COLUMNS($A3:B3)/2)

C3:
=MID($A1,SUMPRODUCT(LEN($A3:B3))+1+COLUMNS($A3:B3)/2,
LOOKUP(2,1/ISNUMBER(-MID($A1,SUMPRODUCT(LEN($A3:B3))+1
+COLUMNS($A3:B3)/2,seq)),seq))

Copy B3 and paste into D3, then copy C3:D3 and paste into E3:F3, G3:H3,
etc. Note that the number portions are text in the formulas above.

With MOREFUNC.XLL, divide and conquer.

A3:
=--REGEX.MID(REGEX.MID($A1,"[^,]+",(COLUMNS($A3:A3)+1)/2),"\d+(\.\d*)?|\.\d+")

B3:
=REGEX.SUBSTITUTE(REGEX.MID($A1,"[^,]+",COLUMNS($A3:B3)/2),
" *"&REPLACE(A3,2,0,IF(LEFT(A3,1)="0","?",""))&" *","")

Copy A3:B3 and paste into C3:D3, E3:F3, etc.

Note: I'm assuming the OP's records are in one cell given the OP's
statement: "one cell could be 5#, 10Tins, 4 lb, 100Pcs, etc." I'm
interpretting the 'etc.' to mean there could be more fields in the
record, not that each of these be in a separate column.