Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Data Parsing Problem

I need to parse text data copied onto an Excel worksheet from a big table in
a PDF file. Each line of text contains a stock's ticker symbol, CUSIP
number, company name, number of shares, and other data, in that order. The
problem is that the company name can be anything from one word to five words
separated by a space, and that messes up the parsing into columns. Using the
Data/Text-to-Columns won't work because company name gets split up into
multiple columns so that the number of shares falls in the wrong column. To
illustrate, here's a typical situation, with three lines of text:

XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17%
PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88%
AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58%

Using VBA code, how can I parse these rows into columns correctly? Thanks
for your help.


  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 27,285
Default Data Parsing Problem

This worked with your sample date:

Option Explicit
Sub fixColumns()
Dim rng As Range
Dim rng1 As Range
Dim rng2 As Range
Dim cell As Range, sStr As String
Range("A1").CurrentRegion.Columns(1).TextToColumns _
Destination:=Range("A1"), DataType:=xlDelimited, _
TextQualifier:=xlDoubleQuote, _
ConsecutiveDelimiter:=True, Tab:=True, _
Semicolon:=False, Comma:=False, Space:=True, _
Other:=False, FieldInfo:=Array( _
Array(1, 1), Array(2, 1), Array(3, 1), _
Array(4, 1), Array(5, 1), Array(6, 1), _
Array(7, 1), Array(8, 1), Array(9, 1), _
Array(10, 1)), TrailingMinusNumbers:=True
Rows(1).Insert
Range("A1:H1").Value = Array("H1", "H2", _
"H3", "H4", "H5", "H6", "H7", "H8")
ActiveSheet.AutoFilterMode = False
Set rng = Intersect(ActiveSheet.Range("A1") _
.CurrentRegion.EntireRow, Columns(8))
rng.AutoFilter Field:=1, Criteria1:="<"
Set rng2 = rng.Offset(1, -5).Resize(rng.Rows.Count - 1, 1)
On Error Resume Next
Set rng1 = rng2.SpecialCells(xlVisible)
On Error GoTo 0
Do While Not rng1 Is Nothing
For Each cell In rng1
sStr = cell.Value & " " & cell.Offset(0, 1).Value
cell.Value = sStr
cell.Offset(0, 1).Delete Shift:=xlShiftToLeft
Next
On Error Resume Next
Set rng1 = Nothing
ActiveSheet.AutoFilterMode = False
rng.AutoFilter Field:=1, Criteria1:="<"
Set rng1 = rng2.SpecialCells(xlVisible)
On Error GoTo 0
Debug.Print rng.Address, rng2.Address
Loop
Rows(1).Delete
MsgBox "No data in column 8 - done"

End Sub

--
Regards,
Tom Ogilvy

"Stratuser" wrote in message
...
I need to parse text data copied onto an Excel worksheet from a big table

in
a PDF file. Each line of text contains a stock's ticker symbol, CUSIP
number, company name, number of shares, and other data, in that order.

The
problem is that the company name can be anything from one word to five

words
separated by a space, and that messes up the parsing into columns. Using

the
Data/Text-to-Columns won't work because company name gets split up into
multiple columns so that the number of shares falls in the wrong column.

To
illustrate, here's a typical situation, with three lines of text:

XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17%
PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88%
AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58%

Using VBA code, how can I parse these rows into columns correctly? Thanks
for your help.




  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,651
Default Data Parsing Problem

On Fri, 23 Dec 2005 08:43:02 -0800, Stratuser
wrote:

I need to parse text data copied onto an Excel worksheet from a big table in
a PDF file. Each line of text contains a stock's ticker symbol, CUSIP
number, company name, number of shares, and other data, in that order. The
problem is that the company name can be anything from one word to five words
separated by a space, and that messes up the parsing into columns. Using the
Data/Text-to-Columns won't work because company name gets split up into
multiple columns so that the number of shares falls in the wrong column. To
illustrate, here's a typical situation, with three lines of text:

XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17%
PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88%
AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58%

Using VBA code, how can I parse these rows into columns correctly? Thanks
for your help.


In VBA code, you can do the following. I assumed you would define the range to
be parsed by selecting the cells in the column, but you could use other
methods. I also assumed there were always four pieces of data following the
stock name. (Shares, Price, Market Cap, Yield)

===============================================
Option Explicit
Sub StockData()
Dim c As Range
Dim ParsedData As Variant
Dim i As Long, j As Long

For Each c In Selection
ParsedData = Split(c.Text, " ")
i = 3
Do Until i = UBound(ParsedData) - 3
ParsedData(2) = ParsedData(2) & " " & ParsedData(i)
i = i + 1
Loop

For j = 3 To 6
ParsedData(j) = ParsedData(i + j - 3)
Next j

ReDim Preserve ParsedData(6)

For i = 0 To 6
c.Offset(0, i).Value = ParsedData(i)
Next i
Next c
End Sub
=====================================

This task can also be done with regular expressions in worksheet formulas.

For this method, first download and install Longre's free morefunc.xll add-in
from http://xcell05.free.fr/

Then you can use the following formulas:

Ticker: =REGEX.MID(A2,"\w+")
Cusip: =REGEX.MID(A2,"\w+",2)

Name:
=REGEX.MID(REGEX.SUBSTITUTE(A2,"^\w+\s+\w+\s+"),"( \w+\s+)+(?=.*\s.*\s.*\s.*%)")

Shares: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-3)
Price: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-2)
MarketCap: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-1)
Yield: =REGEX.MID($A2,"(\d+(,|\.|))+%",-1)


--ron
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Parsing Problem Gloria Lewis Excel Discussion (Misc queries) 6 January 23rd 06 06:06 PM
Parsing Data Saxman Excel Discussion (Misc queries) 4 December 8th 05 08:31 AM
help with data parsing Param Dhillon Excel Programming 5 February 24th 05 03:29 AM
TypeLib Information Problem? Pass a Function's parameter names as string for parsing? Ali G Excel Programming 1 December 3rd 04 07:02 PM
Parsing Data MGAL Excel Programming 5 January 20th 04 03:08 AM


All times are GMT +1. The time now is 05:48 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"