cavac has asked for the wisdom of the Perl Monks concerning the following question:
Hi
I have to parse a text file (comma-separated fields), more or less a standard CSV file. The exception here is that one of the fields can span multiple lines.
I know this is a bad design, but can't change it since it's from an old data export.
The file looks somewhat like this (simplified to highlight the problem):
"1", "title1", "hello world", "foo"
"2", "title2", "hallo welt", "bar"
"3", "title3", "this is a very long line", "baz"
To add to the problem, the file can be a few gigs in size and has quoted characters as well.
I know I'm capable of writing a parser on my own, using a simple state machine (been there, done that, got the headache). But I'd rather use a tried and tested method than spend the next ten days hunting for obscure bugs.
Can you recommend a module that works with this specific file format? The goal here is to extract the data fields record by record and put them into a database.
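To make the requirement concrete, here is a minimal sketch of the behaviour being asked for: a quote-aware reader that streams one logical record at a time, even when a quoted field contains line breaks. It uses Python's standard `csv` module purely as an illustration, not as the Perl module requested above. The helper name `iter_records`, the position of the embedded line break in the sample, and the assumption that quotes inside fields are escaped by doubling them are all hypothetical, since the post does not show those details.

```python
import csv
import io


def iter_records(handle):
    """Yield one list of fields per logical record, streaming from handle."""
    # skipinitialspace=True accepts the blank after each comma in the sample.
    # When a quoted field contains a newline, the reader keeps consuming
    # physical lines until the closing quote, so only one record is held
    # in memory at a time.
    # Assumption: embedded quotes are doubled (""), the csv module default.
    reader = csv.reader(handle, skipinitialspace=True)
    for fields in reader:
        if fields:  # skip completely empty lines
            yield fields


# Hypothetical sample: the line break inside the third field is an assumption.
sample = (
    '"1", "title1", "hello world", "foo"\n'
    '"2", "title2", "hallo welt", "bar"\n'
    '"3", "title3", "this is a very\nlong line", "baz"\n'
)

records = list(iter_records(io.StringIO(sample)))
for record in records:
    print(record)
```

For a real multi-gigabyte file, the same generator could be fed a handle opened with `open(path, newline="")` (where `path` is whatever the export is called), and each yielded record inserted into the database as it arrives instead of being collected into a list.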
Re: Parsing CSV with multiline fields
by Tux (Canon) on Sep 02, 2011 at 14:47 UTC

Re: Parsing CSV with multiline fields
by Ratazong (Monsignor) on Sep 02, 2011 at 14:48 UTC
by AnomalousMonk (Archbishop) on Sep 02, 2011 at 15:57 UTC
by cavac (Prior) on Sep 02, 2011 at 15:32 UTC