Thread: Parse csv files
Posted to microsoft.public.excel.misc
Bryan Hessey
 


Beyond my scope, but
http://www.creativyst.com/Doc/Articl...m#CSVariations and
its comments on Excel might suggest that it doesn't.

--

rob Wrote:
Bryan,

This is also my understanding of csv files (though some people say that
for double quotes you don't need double double quotes). In any case,
the problem is that most cells use double double quotes for double
quotes. Unfortunately, some cells don't seem to be formatted correctly
and don't escape double quotes as double double quotes, as outlined in
my example. I know that Excel can load the file just fine, but with ODBC
it does not work. Any idea whether the parsing algorithm used by Excel is
somehow accessible through the .NET framework? (I was hoping Excel uses
ODBC.)
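
For reference, the .NET Framework does ship a general-purpose delimited-text reader, `TextFieldParser` in `Microsoft.VisualBasic.FileIO`. It is not Excel's parser, and it follows the doubled-quote convention fairly strictly, so the rows with unescaped quotes may well be rejected (treat that as an assumption to verify against the real data). The sketch below only shows that it can read from an in-memory string via `StringReader`; the class and method names `StrictCsv.ParseText` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // needs a reference to Microsoft.VisualBasic

public static class StrictCsv
{
    // Hypothetical helper: parses CSV text held in memory, no temporary file.
    public static List<string[]> ParseText(string csvText)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                // ReadFields throws MalformedLineException for a line
                // it cannot parse with the configured format.
                rows.Add(parser.ReadFields());
            }
        }
        return rows;
    }
}
```

Wrapping the `ReadFields` call in a `try`/`catch` for `MalformedLineException` would at least identify which rows are malformed.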

Thanks


Bryan Hessey wrote:
My understanding is that a .csv file has data fields separated by a
comma character, and if the field contains a comma then that field is
enclosed in quotes (being double-quotes), and if the field contains
quotes then those are indicated by two consecutive quotes.
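
The quoting rule just described can be sketched from the writer's side as follows. The name `CsvQuoting.QuoteField` is hypothetical, and also quoting fields that contain line breaks is an added assumption (a common convention) rather than something stated above.

```csharp
using System;

public static class CsvQuoting
{
    // Hypothetical helper: returns the field as it should appear in a CSV record.
    public static string QuoteField(string field)
    {
        // Assumption: line breaks also force quoting, in addition to commas and quotes.
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        if (!needsQuotes)
        {
            return field;
        }
        // Each embedded quote becomes two consecutive quotes, then the whole field is enclosed.
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }
}
```

With this rule, the text `this "item" is bad` is written as `"this ""item"" is bad"`, while `this item is ok` needs no quotes at all.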

The site http://en.wikipedia.org/wiki/Comma-separated_values perhaps
better explains .csv files, and has a pointer to required drivers.

Hope this helps.

--

rob Wrote:
Here is what I want to do. From the internet I download some data that
are in csv format. All data will be in one long string. Now I need to
extract every cell. The problem is that some cell contents contain
commas and/or double quotes. Some of the cell contents that contain
double quotes use double double quotes and others don't, i.e. some look
like

"this "item" is bad", "this item is ok"

and others like

"this ""item"" is bad", "this item is ok"

There is also a chance that some cells are in double quotes (if they
contain commas or double quotes) and others are not in double quotes
(if they do not contain commas or double quotes). Considering all this
(and possibly more stuff), parsing becomes non-trivial.

As a first approach I stored the downloaded content in a file and
then used ODBC like this:

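// Requires: using System.Data.Odbc; using System.IO;
string connectionString = @"Driver={Microsoft Text Driver (*.txt; *.csv)};DBQ="
    + Path.GetDirectoryName(filename);
OdbcConnection connection = new OdbcConnection(connectionString);
connection.Open();
// The Text Driver treats each file in the DBQ directory as a table.
OdbcCommand command = new OdbcCommand(
    "Select * FROM " + Path.GetFileName(filename), connection);
OdbcDataReader reader = command.ExecuteReader();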

Unfortunately, this approach does not work for the above scenarios.
Excel reads the files in question just fine, though. So my question is:
what is the best approach to read csv files, preferably without having
to create temporary files?
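
One possible approach, offered as a sketch rather than a known-good answer, is a small hand-written parser that works directly on the downloaded string and tolerates both quoting styles. The heuristic is an assumption of this sketch, not Excel's algorithm: inside a quoted field, `""` is an escaped quote; a quote followed (after optional spaces) by a comma or the end of the record closes the field; any other quote is kept as a literal character. The names `LenientCsv` and `ParseLine` are hypothetical, and the method handles one record at a time.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LenientCsv
{
    // Splits one CSV record into fields, tolerating unescaped quotes
    // inside quoted fields.
    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        int i = 0;
        while (true)
        {
            // Spaces before a field are skipped only if they lead to an opening quote.
            int j = i;
            while (j < line.Length && line[j] == ' ') j++;

            if (j < line.Length && line[j] == '"')
            {
                i = ReadQuoted(line, j + 1, fields);
            }
            else
            {
                // Unquoted field: taken verbatim up to the next comma.
                int comma = line.IndexOf(',', i);
                int end = comma < 0 ? line.Length : comma;
                fields.Add(line.Substring(i, end - i));
                i = end;
            }

            if (i >= line.Length) break;
            i++; // step over the comma that ended the field
        }
        return fields;
    }

    // Reads a quoted field starting just after its opening quote.
    // Returns the index of the comma that follows it, or line.Length.
    private static int ReadQuoted(string line, int i, List<string> fields)
    {
        var sb = new StringBuilder();
        while (i < line.Length)
        {
            char c = line[i];
            if (c != '"')
            {
                sb.Append(c);
                i++;
                continue;
            }
            if (i + 1 < line.Length && line[i + 1] == '"')
            {
                sb.Append('"'); // doubled quote: one literal quote
                i += 2;
                continue;
            }
            int k = i + 1;
            while (k < line.Length && line[k] == ' ') k++;
            if (k >= line.Length || line[k] == ',')
            {
                fields.Add(sb.ToString()); // closing quote
                return k;
            }
            sb.Append('"'); // stray, unescaped quote: keep it as data
            i++;
        }
        fields.Add(sb.ToString()); // unterminated field: keep what was read
        return line.Length;
    }
}
```

Both sample records above come out as the same two fields, `this "item" is bad` and `this item is ok`. The heuristic has a known blind spot: an unescaped quote that is immediately followed by a comma inside a field is indistinguishable from a closing quote, so such rows will still be split incorrectly.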



--
Bryan Hessey
------------------------------------------------------------------------
Bryan Hessey's Profile: http://www.excelforum.com/member.php...o&userid=21059
View this thread: http://www.excelforum.com/showthread...hreadid=537641