Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Old April 13th 08, 06:37 AM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)

I need to extract data (the first three letters after the 2nd comma, and the
first 35 characters after the 7th comma) from a csv file (over 100,000 rows),
only after the 8th column matches a values in column A of my spreadsheet. The
two extracted data elements need to be stored in my worksheet in columns B
and C

Any help on this would be appreciated.



  #2   Report Post  
Old April 13th 08, 07:26 AM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Jul 2006
Posts: 9,101
Default Extract matching data from large data file (csv)

The code below should work. It may need a slight change. I a little
confused. The eigth column is the data after the seventh comma. Do you mean
after the eigth comma?

the code below will only bring into the worksheet the needed data. It will
ignore data that doesn't meet your criteria.

Sub Gettext()

Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0

Dim Data(8)

'default folder
Folder = "C:\temp"
ChDir (Folder)

Set fsread = CreateObject("Scripting.FileSystemObject")
FName = Application.GetOpenFilename("CSV (*.csv),*.csv")

Set fread = fsread.GetFile(FName)
Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault)

RowCount = 1
Do While tsread.atendofstream = False

InputLine = tsread.ReadLine

For i = 0 To 7
If InStr(InputLine, ",") 0 Then
Data(i) = Left(InputLine, InStr(InputLine, ",") - 1)
InputLine = Mid(InputLine, InStr(InputLine, ",") + 1)
Else
If Len(InputLine) 0 Then
Data(i) = InputLine
InputLine = ""
Else
Exit For
End If
End If
Next i
'check if 8th item is in column A
Set c = Columns("A:A").Find(what:=Data(7), LookIn:=xlValues, _
lookat:=xlWhole)
If Not c Is Nothing Then
c.Offset(0, 1) = Left(Data(2), 3)
c.Offset(0, 2) = Left(Data(7), 35)
End If
Loop
tsread.Close
End Sub



"Utahstew" wrote:

I need to extract data (the first three letters after the 2nd comma, and the
first 35 characters after the 7th comma) from a csv file (over 100,000 rows),
only after the 8th column matches a values in column A of my spreadsheet. The
two extracted data elements need to be stored in my worksheet in columns B
and C

Any help on this would be appreciated.


  #3   Report Post  
Old April 13th 08, 05:42 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)

Thank you for your quick response.

The code works great until it finds an "&" in Data(7)

Any suggestion on how to over come this ...?


"Joel" wrote:

The code below should work. It may need a slight change. I a little
confused. The eigth column is the data after the seventh comma. Do you mean
after the eigth comma?

the code below will only bring into the worksheet the needed data. It will
ignore data that doesn't meet your criteria.

Sub Gettext()

Const ForReading = 1, ForWriting = 2, ForAppending = 3
Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0

Dim Data(8)

'default folder
Folder = "C:\temp"
ChDir (Folder)

Set fsread = CreateObject("Scripting.FileSystemObject")
FName = Application.GetOpenFilename("CSV (*.csv),*.csv")

Set fread = fsread.GetFile(FName)
Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault)

RowCount = 1
Do While tsread.atendofstream = False

InputLine = tsread.ReadLine

For i = 0 To 7
If InStr(InputLine, ",") 0 Then
Data(i) = Left(InputLine, InStr(InputLine, ",") - 1)
InputLine = Mid(InputLine, InStr(InputLine, ",") + 1)
Else
If Len(InputLine) 0 Then
Data(i) = InputLine
InputLine = ""
Else
Exit For
End If
End If
Next i
'check if 8th item is in column A
Set c = Columns("A:A").Find(what:=Data(7), LookIn:=xlValues, _
lookat:=xlWhole)
If Not c Is Nothing Then
c.Offset(0, 1) = Left(Data(2), 3)
c.Offset(0, 2) = Left(Data(7), 35)
End If
Loop
tsread.Close
End Sub



"Utahstew" wrote:

I need to extract data (the first three letters after the 2nd comma, and the
first 35 characters after the 7th comma) from a csv file (over 100,000 rows),
only after the 8th column matches a values in column A of my spreadsheet. The
two extracted data elements need to be stored in my worksheet in columns B
and C

Any help on this would be appreciated.


  #4   Report Post  
Old April 13th 08, 06:03 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)

I was wrong other "&" exist in the lines previously
  #5   Report Post  
Old April 13th 08, 06:14 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)

Joel,
Here is the modified code I have in the spreadsheet. It throws a Run-time
error '9': Subscript out of range when reading data from line 3073 of the
Total.csv file.

Equity 93671105 NYSE - H&R Block Inc 20080102 18.61 US0936711052 H&R Block
Inc HRB 93671105


Sub getDatafromTextFile()
Dim Ticker As String

tickers = Application.CountA(ActiveSheet.Range("A:A")) 'determines number of
Tickers

Const ForReading = 1
Const ForWriting = 2
Const ForAppending = 3
Const TristateUSeDefault = -2
Const TristateTrue = -1
Const TristateFalse = 0

Dim Data(9)

'Default folder

Set fsread = CreateObject("Scripting.fileSystemObject")
Fname = Application.GetOpenFilename(FileFilter:="Text Files (*.csv), *.csv",
Title:="Please select a file")
If Fname = False Then
' They pressed Cancel
MsgBox "Stopping because you did not select a file"
End If

Set fread = fsread.getfile(Fname)
Set tsread = fread.openastextstream(ForReading, TristateUSeDefault)

RowCount = 1


Do While tsread.atendofstream = False

inputline = tsread.readline

For i = 0 To 10
If InStr(inputline, ",") 0 Then
Data(i) = Left(inputline, InStr(inputline, ",") - 1)
inputline = Mid(inputline, InStr(inputline, ",") + 1)
Else
If Len(inputline) 0 Then
Data(i) = inputline
inputline = ""
Else
Exit For
End If
End If

Next i

Set c = Columns("A:A").Find(what:=Trim(Data(8)), LookIn:=xlValues,
lookat:=xlWhole)
If Not c Is Nothing Then
c.Offset(0, 1) = Left(Data(2), 3)
c.Offset(0, 2) = Left(Data(7), 35)
End If
Loop
tsread.Close
End Sub


  #6   Report Post  
Old April 13th 08, 07:01 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)


The lies that have problems have extra comma at the end.

The following line work in the code
Equity,93679108,NYSE - Blockbuster Inc Class
A,20080102,3.64,,US0936791088,Blockbuster Inc Class A,BBI,93679108

The following line does not work in the code
Equity,9.26E+104,NYSE - Enh Eq Yd Pr Fd,20080102,15.63,,US09256E1010,Enh Eq
Yd Pr Fd,ECV,9.26E+104,,

Notice the final two comma

How do I limit theinput line to only the first ten columns and disregard the
rest?
  #7   Report Post  
Old April 13th 08, 08:10 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Jul 2006
Posts: 9,101
Default Extract matching data from large data file (csv)

I'm not sure why you modified the size of the Data array and the size of the
FOR loop. You only need 8 items. Why go through the loop additional times
and get data you don't need. The original code would of just ignored the
other data in the line.

The error is occuring becuase the for loop is now 0 to 10 which is 11 items.
Dat is defoined as Data(9) which is nine items. The error is occuring
becuase you are putting more items into the array than it can hold.

"Utahstew" wrote:


The lies that have problems have extra comma at the end.

The following line work in the code
Equity,93679108,NYSE - Blockbuster Inc Class
A,20080102,3.64,,US0936791088,Blockbuster Inc Class A,BBI,93679108

The following line does not work in the code
Equity,9.26E+104,NYSE - Enh Eq Yd Pr Fd,20080102,15.63,,US09256E1010,Enh Eq
Yd Pr Fd,ECV,9.26E+104,,

Notice the final two comma

How do I limit theinput line to only the first ten columns and disregard the
rest?

  #8   Report Post  
Old April 13th 08, 08:25 PM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Apr 2008
Posts: 6
Default Extract matching data from large data file (csv)

Thank you Joel, your fast response is appreciated

I had forgotten arrays start at 0.


  #9   Report Post  
Old April 14th 08, 03:57 AM posted to microsoft.public.excel.programming
external usenet poster
 
First recorded activity by ExcelBanter: Jul 2006
Posts: 9,101
Default Extract matching data from large data file (csv)

I think you should add an exit sub in the code below. You are going to get
an error if you don't make the change.

from
Title:="Please select a file")
If Fname = False Then
' They pressed Cancel
MsgBox "Stopping because you did not select a file"
End If
to
Title:="Please select a file")
If Fname = False Then
' They pressed Cancel
MsgBox "Stopping because you did not select a file"
exit sub
End If


"Utahstew" wrote:

Thank you Joel, your fast response is appreciated

I had forgotten arrays start at 0.




Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Large Data File Consolidation... divrdrew Excel Worksheet Functions 2 January 25th 12 10:45 PM
Very large file - simple data Erin Excel Discussion (Misc queries) 3 June 5th 08 03:51 AM
Importing (too) large data file Ian Excel Discussion (Misc queries) 1 August 9th 06 07:45 AM
Using condition to extract data by matching values from 2 different Excel files schellam Excel Programming 0 November 29th 05 09:43 PM
Extract only non-matching data vect98 Excel Programming 6 August 25th 05 02:11 AM


All times are GMT +1. The time now is 09:19 PM.

Powered by vBulletin® Copyright ©2000 - 2022, Jelsoft Enterprises Ltd.
Copyright 2004-2022 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"

 

Copyright © 2017