Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Data Parsing Problem
I need to parse text data copied onto an Excel worksheet from a big table in
a PDF file. Each line of text contains a stock's ticker symbol, CUSIP number, company name, number of shares, and other data, in that order. The problem is that the company name can be anything from one word to five words separated by a space, and that messes up the parsing into columns. Using the Data/Text-to-Columns won't work because company name gets split up into multiple columns so that the number of shares falls in the wrong column. To illustrate, here's a typical situation, with three lines of text: XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17% PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88% AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58% Using VBA code, how can I parse these rows into columns correctly? Thanks for your help. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Data Parsing Problem
This worked with your sample date:
Option Explicit Sub fixColumns() Dim rng As Range Dim rng1 As Range Dim rng2 As Range Dim cell As Range, sStr As String Range("A1").CurrentRegion.Columns(1).TextToColumns _ Destination:=Range("A1"), DataType:=xlDelimited, _ TextQualifier:=xlDoubleQuote, _ ConsecutiveDelimiter:=True, Tab:=True, _ Semicolon:=False, Comma:=False, Space:=True, _ Other:=False, FieldInfo:=Array( _ Array(1, 1), Array(2, 1), Array(3, 1), _ Array(4, 1), Array(5, 1), Array(6, 1), _ Array(7, 1), Array(8, 1), Array(9, 1), _ Array(10, 1)), TrailingMinusNumbers:=True Rows(1).Insert Range("A1:H1").Value = Array("H1", "H2", _ "H3", "H4", "H5", "H6", "H7", "H8") ActiveSheet.AutoFilterMode = False Set rng = Intersect(ActiveSheet.Range("A1") _ .CurrentRegion.EntireRow, Columns(8)) rng.AutoFilter Field:=1, Criteria1:="<" Set rng2 = rng.Offset(1, -5).Resize(rng.Rows.Count - 1, 1) On Error Resume Next Set rng1 = rng2.SpecialCells(xlVisible) On Error GoTo 0 Do While Not rng1 Is Nothing For Each cell In rng1 sStr = cell.Value & " " & cell.Offset(0, 1).Value cell.Value = sStr cell.Offset(0, 1).Delete Shift:=xlShiftToLeft Next On Error Resume Next Set rng1 = Nothing ActiveSheet.AutoFilterMode = False rng.AutoFilter Field:=1, Criteria1:="<" Set rng1 = rng2.SpecialCells(xlVisible) On Error GoTo 0 Debug.Print rng.Address, rng2.Address Loop Rows(1).Delete MsgBox "No data in column 8 - done" End Sub -- Regards, Tom Ogilvy "Stratuser" wrote in message ... I need to parse text data copied onto an Excel worksheet from a big table in a PDF file. Each line of text contains a stock's ticker symbol, CUSIP number, company name, number of shares, and other data, in that order. The problem is that the company name can be anything from one word to five words separated by a space, and that messes up the parsing into columns. Using the Data/Text-to-Columns won't work because company name gets split up into multiple columns so that the number of shares falls in the wrong column. To illustrate, here's a typical situation, with three lines of text: XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17% PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88% AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58% Using VBA code, how can I parse these rows into columns correctly? Thanks for your help. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Data Parsing Problem
On Fri, 23 Dec 2005 08:43:02 -0800, Stratuser
wrote: I need to parse text data copied onto an Excel worksheet from a big table in a PDF file. Each line of text contains a stock's ticker symbol, CUSIP number, company name, number of shares, and other data, in that order. The problem is that the company name can be anything from one word to five words separated by a space, and that messes up the parsing into columns. Using the Data/Text-to-Columns won't work because company name gets split up into multiple columns so that the number of shares falls in the wrong column. To illustrate, here's a typical situation, with three lines of text: XOM 30231G102 EXXON MOBIL CORP 622,900.00 63.5400 39,579,066.00 4.17% PFE 717081103 PFIZER INC 1,475,900.00 24.9700 36,853,223.00 3.88% AEP 025537101 AMERICAN ELEC PWR INC 377,300.00 39.7000 14,978,810.00 1.58% Using VBA code, how can I parse these rows into columns correctly? Thanks for your help. In VBA code, you can do the following. I assumed you would define the range to be parsed by selecting the cells in the column, but you could use other methods. I also assumed there were always four pieces of data following the stock name. (Shares, Price, Market Cap, Yield) =============================================== Option Explicit Sub StockData() Dim c As Range Dim ParsedData As Variant Dim i As Long, j As Long For Each c In Selection ParsedData = Split(c.Text, " ") i = 3 Do Until i = UBound(ParsedData) - 3 ParsedData(2) = ParsedData(2) & " " & ParsedData(i) i = i + 1 Loop For j = 3 To 6 ParsedData(j) = ParsedData(i + j - 3) Next j ReDim Preserve ParsedData(6) For i = 0 To 6 c.Offset(0, i).Value = ParsedData(i) Next i Next c End Sub ===================================== This task can also be done with regular expressions in worksheet formulas. For this method, first download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr/ Then you can use the following formulas: Ticker: =REGEX.MID(A2,"\w+") Cusip: =REGEX.MID(A2,"\w+",2) Name: =REGEX.MID(REGEX.SUBSTITUTE(A2,"^\w+\s+\w+\s+"),"( \w+\s+)+(?=.*\s.*\s.*\s.*%)") Shares: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-3) Price: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-2) MarketCap: =REGEX.MID($A2,"(\d+(,|\.|))+\s",-1) Yield: =REGEX.MID($A2,"(\d+(,|\.|))+%",-1) --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Parsing Problem | Excel Discussion (Misc queries) | |||
Parsing Data | Excel Discussion (Misc queries) | |||
help with data parsing | Excel Programming | |||
TypeLib Information Problem? Pass a Function's parameter names as string for parsing? | Excel Programming | |||
Parsing Data | Excel Programming |