Thread: Parse csv files
Posted to microsoft.public.excel.programming,microsoft.public.excel.misc
rob
 

Here is what I want to do. I download some data from the internet in
CSV format. All of the data arrives as one long string, and I need to
extract every cell. The problem is that some cell contents contain
commas and/or double quotes. Some of the cells that contain double
quotes escape them by doubling them and others don't, i.e. some look
like

"this "item" is bad", "this item is ok"

and others like

"this ""item"" is bad", "this item is ok"

There is also a chance that some cells are in double quotes (if they
contain commas or double quotes) and others are not in double quotes
(if they do not contain commas or double quotes). Considering all this
(and possibly more stuff), parsing becomes nontrivial.
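
To make the problem concrete, here is a rough sketch of the kind of
tolerant splitting I have in mind. It is only an illustration under
assumptions of my own: LenientCsv and SplitRecord are made-up names, it
handles one record at a time with no line breaks inside cells, and it
guesses that a double quote ends a cell only when a comma or the end of
the record follows it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class LenientCsv
{
    // Splits one CSV record into cells. Hypothetical helper, not a library API.
    public static List<string> SplitRecord(string line)
    {
        var fields = new List<string>();
        int i = 0;
        while (true)
        {
            // Look past leading spaces to see whether the cell is quoted.
            int start = i;
            while (i < line.Length && line[i] == ' ') i++;
            var field = new StringBuilder();

            if (i < line.Length && line[i] == '"')
            {
                i++; // consume the opening quote
                while (i < line.Length)
                {
                    char c = line[i];
                    if (c != '"') { field.Append(c); i++; continue; }

                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        field.Append('"'); // doubled quote: one literal quote
                        i += 2;
                        continue;
                    }

                    // Assumption: a quote closes the cell only if a comma
                    // or the end of the record follows (spaces allowed).
                    int j = i + 1;
                    while (j < line.Length && line[j] == ' ') j++;
                    if (j == line.Length || line[j] == ',') { i = j; break; }

                    field.Append('"'); // stray unescaped quote: keep it
                    i++;
                }
            }
            else
            {
                // Unquoted cell: take everything up to the next comma verbatim.
                i = start;
                while (i < line.Length && line[i] != ',') field.Append(line[i++]);
            }

            fields.Add(field.ToString());
            if (i >= line.Length) break;
            i++; // skip the comma separating this cell from the next
        }
        return fields;
    }
}
```

With this, both sample lines above come out as the two cells
this "item" is bad and this item is ok. It still guesses wrong when an
unescaped quote sits directly in front of a comma inside a cell, so it
is not a complete answer.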

As a first approach I stored the downloaded content in a file and then
used ODBC like this:

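// Needs: using System.Data.Odbc; using System.IO;
// The driver name must stay on one line inside the verbatim string.
connectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ="
    + Path.GetDirectoryName(filename);
connection = new OdbcConnection(connectionString);
connection.Open();
// The text driver treats each file in the DBQ directory as a table.
command = new OdbcCommand("SELECT * FROM " + Path.GetFileName(filename),
    connection);
reader = command.ExecuteReader();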

Unfortunately, this approach does not work for the above scenarios.
Excel reads the files in question just fine, though. So my question is:
what is the best approach to reading CSV files, preferably without
having to create temporary files?

Thanks