Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel.misc
rob rob is offline
external usenet poster
 
Posts: 1
Default Parse csv files

Here is what I want to do. From the internet I download some data that
are in csv format. All data will be in one long string. Now I need to
extract every cell. The problem ist that some cell content contains
commas and/or double quotes. Some of the cell contents that contain
double quotes use double double quotes and others don't, i.e. some look
like

"this "item" is bad", "this item is ok"

and others like

"this ""item"" is bad", "this item is ok"

There is also a chance that some cell are in double quotes (if they
contains commas or double quotes) and others are not in double quotes
(if they do not contain commas or double quotes). Considering all this
(and possibly more stuff) parsing becomes non trivial.

As a first approach I stored the content downloaded into a file and
then use odbc like this:

connectionString = @"Driver={Microsoft Text Driver (*.txt;
*.csv)};DBQ=" + Path.GetDirectoryName(filename);
connection = new OdbcConnection(connectionString);
connection.Open();
command = new OdbcCommand("Select * FROM " +
Path.GetFileName(filename), connection);
reader = command.ExecuteReader();

Unfortunately, this approach does not work for the above scenarios.
Excel reads the files in question just fine, though. So my question is
what is the best approach to read csv files, preferably without having
to create temporary files?

Thanks

  #2   Report Post  
Posted to microsoft.public.excel.misc,microsoft.public.excel.programming
external usenet poster
 
Posts: 400
Default Parse csv files

An (Excel) CSV file is:
1. simply a text file with columns delimited by a comma
2. simply a text file with columns delimited by a comma and text values
enclosed in double quotes.
In 1, a missing value is indicated by two contiguous commas, and in 2. a
missing value is indicated by ,"",
Your CSV file is much more complicated (dos not follow the patter 1 or 2)
and you appear to be using ADO.NET. Although you have ruled it out, you might
have to pre-process your file as 1 or 2 above before you can execute SQL
queries. You might find that when the SQL query works, it ignores the first
row (unless it contains or you add column names in the first row).



"rob" wrote:

Here is what I want to do. From the internet I download some data that
are in csv format. All data will be in one long string. Now I need to
extract every cell. The problem ist that some cell content contains
commas and/or double quotes. Some of the cell contents that contain
double quotes use double double quotes and others don't, i.e. some look
like

"this "item" is bad", "this item is ok"

and others like

"this ""item"" is bad", "this item is ok"

There is also a chance that some cell are in double quotes (if they
contains commas or double quotes) and others are not in double quotes
(if they do not contain commas or double quotes). Considering all this
(and possibly more stuff) parsing becomes non trivial.

As a first approach I stored the content downloaded into a file and
then use odbc like this:

connectionString = @"Driver={Microsoft Text Driver (*.txt;
*.csv)};DBQ=" + Path.GetDirectoryName(filename);
connection = new OdbcConnection(connectionString);
connection.Open();
command = new OdbcCommand("Select * FROM " +
Path.GetFileName(filename), connection);
reader = command.ExecuteReader();

Unfortunately, this approach does not work for the above scenarios.
Excel reads the files in question just fine, though. So my question is
what is the best approach to read csv files, preferably without having
to create temporary files?

Thanks


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
csv files do not parse and open up like they should Trolane Excel Discussion (Misc queries) 0 July 1st 08 06:02 PM
Parse from the Right PA Excel Worksheet Functions 6 June 11th 06 06:05 PM
Parse csv files rob Excel Discussion (Misc queries) 11 May 3rd 06 08:38 AM
Parse database into several files based on column D Steph[_3_] Excel Programming 3 September 28th 04 03:01 AM
Parse Sarah[_4_] Excel Programming 2 December 30th 03 11:16 PM


All times are GMT +1. The time now is 06:14 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"