View Single Post
  #9   Report Post  
Posted to microsoft.public.excel.misc
R.Douthwaite
 
Posts: n/a
Default A Difficult Unconcatinate Problem


--
R.Douthwaite.
Information Coordinator
www.swift-research.co.uk


"Ron Rosenfeld" wrote:

On Tue, 24 Jan 2006 11:27:51 -0800, R.Douthwaite
wrote:

"Ron Rosenfeld" wrote:

On Tue, 13 Sep 2005 05:57:13 -0700, "RestlessAde"
wrote:

Hi,

I have tried to solve this problem using a mixture of text functions such as
RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
suggestions.

I'm trying to split out the following data stored in a single column into
three columns: Country, Data Value and Source.

Angola 13,294,000 Source: ibid.
Anguilla 13,000 Source: ibid.
Antigua and Barbuda 76,000 Source: ibid.
Argentina 37,880,000 Source: ibid.
Armenia 3,206,000 Source: ibid.
Aruba 94,000 Source: ibid.
Australia 20,125,000 Source: ibid.

The problem I'm having relates to the fact that some countries contain more
than one word, so I can't just search for the first " ". I think the answer
is to somehow detect the first instance of a numeric value, but I have no
idea how to do this.

Thanks,
RA

Array formulas will do this. The first two are array formulas; the last is
not. To enter an array formula, be sure to hold down <ctrl<shift while
hitting <enter. Excel will place braces {...} around the formula.

Assuming data is in A2.

Country (array formula):

=LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))

Data Value (array formula):

=MID(A2,MATCH(FALSE,ISERROR(-MID(A2,ROW(
INDIRECT("1:"&LEN(A2))),1)),0),-1+FIND("Source",A2)
-MATCH(FALSE,ISERROR(-MID(A2,ROW(
INDIRECT("1:"&LEN(A2))),1)),0))

Source (NOT an array formula)

=MID(A3,FIND("Source: ",A3)+8,255)


--ron



Ron (or anyone else for that matter),

The first array formula cited above has solved a very major headach I've
been working on all day as it can be modified to give the position of the
first number in a mixed string of numbers and letters and for that you
deserve major kudos!

However I hate not knowing how it is achieved! Can you explain in plain
english how this is working?

I guess this may help me get my head around a few other thorny issues I have
pending and it may just help out a few others here too

Thanks in advance

R.Douthwaite




=LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))

If you separate out the various nestings:

MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)

This segment returns an array of each character in A2. (The ROW(INDIRECT(...)
returns an array of "1:n" where n is the number of characters in A2.

The next step is to put a minus sign <- in front of each character:

-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)

If the character is NOT a number, the array will return an error; otherwise it
will return a number.

We then Test each character position to see if there is an error:

ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1))

This returns an array of {TRUE,TRUE,FALSE...} depending on whether there is a
digit.

The MATCH function with the 0 match_type argument then finds the first FALSE
which would be the location of the first digit.

================================

Since September, however, I've become enamored of "regular expressions". One
way of using these effectively in Excel is to download and install Longre's
morefunc.xll add-in from http://xcell05.free.fr/

Then, to find the position of the first number in the above, one could use the
much simpler formula:

=REGEX.FIND(A1,"\d")

To extract the first number, where it might contain commas:

=REGEX.MID(A1,"(\d+|,)+")

Depending on your data, these can be much more flexible.


--ron


Ron,

Thank you for the excellent explaination. I think you just opened up a whole
new world to me... Not sure that's necessarily a good thing :-)

R.Douthwaite