Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
I need to extract data (the first three letters after the 2nd comma, and the
first 35 characters after the 7th comma) from a csv file (over 100,000 rows), only after the 8th column matches a values in column A of my spreadsheet. The two extracted data elements need to be stored in my worksheet in columns B and C Any help on this would be appreciated. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
The code below should work. It may need a slight change. I a little
confused. The eigth column is the data after the seventh comma. Do you mean after the eigth comma? the code below will only bring into the worksheet the needed data. It will ignore data that doesn't meet your criteria. Sub Gettext() Const ForReading = 1, ForWriting = 2, ForAppending = 3 Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0 Dim Data(8) 'default folder Folder = "C:\temp" ChDir (Folder) Set fsread = CreateObject("Scripting.FileSystemObject") FName = Application.GetOpenFilename("CSV (*.csv),*.csv") Set fread = fsread.GetFile(FName) Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault) RowCount = 1 Do While tsread.atendofstream = False InputLine = tsread.ReadLine For i = 0 To 7 If InStr(InputLine, ",") 0 Then Data(i) = Left(InputLine, InStr(InputLine, ",") - 1) InputLine = Mid(InputLine, InStr(InputLine, ",") + 1) Else If Len(InputLine) 0 Then Data(i) = InputLine InputLine = "" Else Exit For End If End If Next i 'check if 8th item is in column A Set c = Columns("A:A").Find(what:=Data(7), LookIn:=xlValues, _ lookat:=xlWhole) If Not c Is Nothing Then c.Offset(0, 1) = Left(Data(2), 3) c.Offset(0, 2) = Left(Data(7), 35) End If Loop tsread.Close End Sub "Utahstew" wrote: I need to extract data (the first three letters after the 2nd comma, and the first 35 characters after the 7th comma) from a csv file (over 100,000 rows), only after the 8th column matches a values in column A of my spreadsheet. The two extracted data elements need to be stored in my worksheet in columns B and C Any help on this would be appreciated. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
Thank you for your quick response.
The code works great until it finds an "&" in Data(7) Any suggestion on how to over come this ...? "Joel" wrote: The code below should work. It may need a slight change. I a little confused. The eigth column is the data after the seventh comma. Do you mean after the eigth comma? the code below will only bring into the worksheet the needed data. It will ignore data that doesn't meet your criteria. Sub Gettext() Const ForReading = 1, ForWriting = 2, ForAppending = 3 Const TristateUseDefault = -2, TristateTrue = -1, TristateFalse = 0 Dim Data(8) 'default folder Folder = "C:\temp" ChDir (Folder) Set fsread = CreateObject("Scripting.FileSystemObject") FName = Application.GetOpenFilename("CSV (*.csv),*.csv") Set fread = fsread.GetFile(FName) Set tsread = fread.OpenAsTextStream(ForReading, TristateUseDefault) RowCount = 1 Do While tsread.atendofstream = False InputLine = tsread.ReadLine For i = 0 To 7 If InStr(InputLine, ",") 0 Then Data(i) = Left(InputLine, InStr(InputLine, ",") - 1) InputLine = Mid(InputLine, InStr(InputLine, ",") + 1) Else If Len(InputLine) 0 Then Data(i) = InputLine InputLine = "" Else Exit For End If End If Next i 'check if 8th item is in column A Set c = Columns("A:A").Find(what:=Data(7), LookIn:=xlValues, _ lookat:=xlWhole) If Not c Is Nothing Then c.Offset(0, 1) = Left(Data(2), 3) c.Offset(0, 2) = Left(Data(7), 35) End If Loop tsread.Close End Sub "Utahstew" wrote: I need to extract data (the first three letters after the 2nd comma, and the first 35 characters after the 7th comma) from a csv file (over 100,000 rows), only after the 8th column matches a values in column A of my spreadsheet. The two extracted data elements need to be stored in my worksheet in columns B and C Any help on this would be appreciated. |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
I was wrong other "&" exist in the lines previously
|
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
Joel,
Here is the modified code I have in the spreadsheet. It throws a Run-time error '9': Subscript out of range when reading data from line 3073 of the Total.csv file. Equity 93671105 NYSE - H&R Block Inc 20080102 18.61 US0936711052 H&R Block Inc HRB 93671105 Sub getDatafromTextFile() Dim Ticker As String tickers = Application.CountA(ActiveSheet.Range("A:A")) 'determines number of Tickers Const ForReading = 1 Const ForWriting = 2 Const ForAppending = 3 Const TristateUSeDefault = -2 Const TristateTrue = -1 Const TristateFalse = 0 Dim Data(9) 'Default folder Set fsread = CreateObject("Scripting.fileSystemObject") Fname = Application.GetOpenFilename(FileFilter:="Text Files (*.csv), *.csv", Title:="Please select a file") If Fname = False Then ' They pressed Cancel MsgBox "Stopping because you did not select a file" End If Set fread = fsread.getfile(Fname) Set tsread = fread.openastextstream(ForReading, TristateUSeDefault) RowCount = 1 Do While tsread.atendofstream = False inputline = tsread.readline For i = 0 To 10 If InStr(inputline, ",") 0 Then Data(i) = Left(inputline, InStr(inputline, ",") - 1) inputline = Mid(inputline, InStr(inputline, ",") + 1) Else If Len(inputline) 0 Then Data(i) = inputline inputline = "" Else Exit For End If End If Next i Set c = Columns("A:A").Find(what:=Trim(Data(8)), LookIn:=xlValues, lookat:=xlWhole) If Not c Is Nothing Then c.Offset(0, 1) = Left(Data(2), 3) c.Offset(0, 2) = Left(Data(7), 35) End If Loop tsread.Close End Sub |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
The lies that have problems have extra comma at the end. The following line work in the code Equity,93679108,NYSE - Blockbuster Inc Class A,20080102,3.64,,US0936791088,Blockbuster Inc Class A,BBI,93679108 The following line does not work in the code Equity,9.26E+104,NYSE - Enh Eq Yd Pr Fd,20080102,15.63,,US09256E1010,Enh Eq Yd Pr Fd,ECV,9.26E+104,, Notice the final two comma How do I limit theinput line to only the first ten columns and disregard the rest? |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
I'm not sure why you modified the size of the Data array and the size of the
FOR loop. You only need 8 items. Why go through the loop additional times and get data you don't need. The original code would of just ignored the other data in the line. The error is occuring becuase the for loop is now 0 to 10 which is 11 items. Dat is defoined as Data(9) which is nine items. The error is occuring becuase you are putting more items into the array than it can hold. "Utahstew" wrote: The lies that have problems have extra comma at the end. The following line work in the code Equity,93679108,NYSE - Blockbuster Inc Class A,20080102,3.64,,US0936791088,Blockbuster Inc Class A,BBI,93679108 The following line does not work in the code Equity,9.26E+104,NYSE - Enh Eq Yd Pr Fd,20080102,15.63,,US09256E1010,Enh Eq Yd Pr Fd,ECV,9.26E+104,, Notice the final two comma How do I limit theinput line to only the first ten columns and disregard the rest? |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
Thank you Joel, your fast response is appreciated
I had forgotten arrays start at 0. |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Extract matching data from large data file (csv)
I think you should add an exit sub in the code below. You are going to get
an error if you don't make the change. from Title:="Please select a file") If Fname = False Then ' They pressed Cancel MsgBox "Stopping because you did not select a file" End If to Title:="Please select a file") If Fname = False Then ' They pressed Cancel MsgBox "Stopping because you did not select a file" exit sub End If "Utahstew" wrote: Thank you Joel, your fast response is appreciated I had forgotten arrays start at 0. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Large Data File Consolidation... | Excel Worksheet Functions | |||
Very large file - simple data | Excel Discussion (Misc queries) | |||
Importing (too) large data file | Excel Discussion (Misc queries) | |||
Using condition to extract data by matching values from 2 different Excel files | Excel Programming | |||
Extract only non-matching data | Excel Programming |