Thread: Parse csv files
View Single Post
  #6   Report Post  
Posted to microsoft.public.excel.misc,microsoft.public.excel.programming
AA2e72E
 
Posts: n/a
Default Parse csv files

An (Excel) CSV file is:
1. simply a text file with columns delimited by a comma
2. simply a text file with columns delimited by a comma and text values
enclosed in double quotes.
In 1, a missing value is indicated by two contiguous commas, and in 2. a
missing value is indicated by ,"",
Your CSV file is much more complicated (dos not follow the patter 1 or 2)
and you appear to be using ADO.NET. Although you have ruled it out, you might
have to pre-process your file as 1 or 2 above before you can execute SQL
queries. You might find that when the SQL query works, it ignores the first
row (unless it contains or you add column names in the first row).



"rob" wrote:

Here is what I want to do. From the internet I download some data that
are in csv format. All data will be in one long string. Now I need to
extract every cell. The problem ist that some cell content contains
commas and/or double quotes. Some of the cell contents that contain
double quotes use double double quotes and others don't, i.e. some look
like

"this "item" is bad", "this item is ok"

and others like

"this ""item"" is bad", "this item is ok"

There is also a chance that some cell are in double quotes (if they
contains commas or double quotes) and others are not in double quotes
(if they do not contain commas or double quotes). Considering all this
(and possibly more stuff) parsing becomes non trivial.

As a first approach I stored the content downloaded into a file and
then use odbc like this:

connectionString = @"Driver={Microsoft Text Driver (*.txt;
*.csv)};DBQ=" + Path.GetDirectoryName(filename);
connection = new OdbcConnection(connectionString);
connection.Open();
command = new OdbcCommand("Select * FROM " +
Path.GetFileName(filename), connection);
reader = command.ExecuteReader();

Unfortunately, this approach does not work for the above scenarios.
Excel reads the files in question just fine, though. So my question is
what is the best approach to read csv files, preferably without having
to create temporary files?

Thanks