View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.misc
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default Parse delimiter is change from text to numbers

On Tue, 23 Dec 2008 12:58:31 -0800, ORLANDO V
wrote:

I have a long list of items like this one below.
They consist of four things: Asset name, country, units, and market value.

Accor SA France 3,300,000 177,214,917

I want to parse this? Any ideas on the best way to do this? Open to all
suggestions. Prefer formulas over VBA for this one .


The problem is parsing out the country name, since there is no good divider
between the Asset Name and the Country, and it is theoretically possible for an
Asset Name to contain a Country Name. It is also possible for a Country to have
multiple words in the name (e.g. Great Britain; United States; etc).

For Units, you could use:

=--TRIM(LEFT(RIGHT(SUBSTITUTE(A2," ",REPT(" ",99)),198),99))

and for Market Value:

=--TRIM(RIGHT(SUBSTITUTE(A2," ",REPT(" ",99)),99))

The preceding double unary converts the values into numbers; otherwise they
will be returned as Text.

Because of the problem with parsing out the Country Name, the approach I would
use would be to have a List of Countries, and extract the name that is followed
by the Units and Market Value numbers. This can be done easily using Regular
Expressions, but would require a User Defined Function (meaning VBA).

And if you are doing that, you might as well use the same approach for the
other portions, too.

So, somewhere set up a List of Countries.

You can then use this UDF:

To enter the UDF, <alt-F11 opens the VB Editor. Ensure your project is
highlighted in the project explorer window, then Insert/Module and paste the
code below into the window that opens.

To use this, enter a formula in the following format:

=parsespecial(cell_ref, item, country_list)

Item can be "Asset", "Country", "Units", or "Market"

country_list is a range reference where you have stored a list of valid country
names.

==========================================
Option Explicit
Function ParseSpecial(str As String, sItem As String, _
rCountryList As Range) As Variant
Dim re As Object, mc As Object
Dim sPattern As String
Dim CountryList As String
Dim c As Range

For Each c In rCountryList
CountryList = CountryList & c.Text & "|"
Next c
CountryList = "(" & Left(CountryList, Len(CountryList) - 1) & ")"

Select Case Left(UCase(sItem), 5)
Case Is = "COUNT"
sPattern = CountryList & "(?=(\s+[\d.,]+){2}\s*$)"
Case Is = "ASSET"
sPattern = "^.*(?=\s+" & CountryList & "(?=(\s+[\d.,]+){2}\s*$))"
Case Is = "UNITS"
sPattern = "[\d,]+(?=\s+[\d,]+\s*$)"
Case Is = "MARKE"
sPattern = "[\d,]+(?=\s*$)"
Case Else
MsgBox ("sItem must be 'Country', 'Asset', 'Units' or 'Market Value'")
ParseSpecial = CVErr(xlErrNA)
Exit Function
End Select

Set re = CreateObject("vbscript.regexp")
re.Pattern = sPattern
If re.test(str) = True Then
Set mc = re.Execute(str)
ParseSpecial = mc(0).Value
End If
End Function
=====================================



--ron