Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.programming,microsoft.public.excel.misc
|
|||
|
|||
![]()
Here is what I want to do. From the internet I download some data that
are in csv format. All data will be in one long string. Now I need to extract every cell. The problem ist that some cell content contains commas and/or double quotes. Some of the cell contents that contain double quotes use double double quotes and others don't, i.e. some look like "this "item" is bad", "this item is ok" and others like "this ""item"" is bad", "this item is ok" There is also a chance that some cell are in double quotes (if they contains commas or double quotes) and others are not in double quotes (if they do not contain commas or double quotes). Considering all this (and possibly more stuff) parsing becomes non trivial. As a first approach I stored the content downloaded into a file and then use odbc like this: connectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=" + Path.GetDirectoryName(filename); connection = new OdbcConnection(connectionString); connection.Open(); command = new OdbcCommand("Select * FROM " + Path.GetFileName(filename), connection); reader = command.ExecuteReader(); Unfortunately, this approach does not work for the above scenarios. Excel reads the files in question just fine, though. So my question is what is the best approach to read csv files, preferably without having to create temporary files? Thanks |
#2
![]()
Posted to microsoft.public.excel.misc,microsoft.public.excel.programming
|
|||
|
|||
![]()
An (Excel) CSV file is:
1. simply a text file with columns delimited by a comma 2. simply a text file with columns delimited by a comma and text values enclosed in double quotes. In 1, a missing value is indicated by two contiguous commas, and in 2. a missing value is indicated by ,"", Your CSV file is much more complicated (dos not follow the patter 1 or 2) and you appear to be using ADO.NET. Although you have ruled it out, you might have to pre-process your file as 1 or 2 above before you can execute SQL queries. You might find that when the SQL query works, it ignores the first row (unless it contains or you add column names in the first row). "rob" wrote: Here is what I want to do. From the internet I download some data that are in csv format. All data will be in one long string. Now I need to extract every cell. The problem ist that some cell content contains commas and/or double quotes. Some of the cell contents that contain double quotes use double double quotes and others don't, i.e. some look like "this "item" is bad", "this item is ok" and others like "this ""item"" is bad", "this item is ok" There is also a chance that some cell are in double quotes (if they contains commas or double quotes) and others are not in double quotes (if they do not contain commas or double quotes). Considering all this (and possibly more stuff) parsing becomes non trivial. As a first approach I stored the content downloaded into a file and then use odbc like this: connectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ=" + Path.GetDirectoryName(filename); connection = new OdbcConnection(connectionString); connection.Open(); command = new OdbcCommand("Select * FROM " + Path.GetFileName(filename), connection); reader = command.ExecuteReader(); Unfortunately, this approach does not work for the above scenarios. Excel reads the files in question just fine, though. So my question is what is the best approach to read csv files, preferably without having to create temporary files? Thanks |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
csv files do not parse and open up like they should | Excel Discussion (Misc queries) | |||
Parse from the Right | Excel Worksheet Functions | |||
Parse csv files | Excel Discussion (Misc queries) | |||
Parse database into several files based on column D | Excel Programming | |||
Parse | Excel Programming |