Re^5: Comparison of the parsing features of CSV (and xSV) modules

by Wally Hartshorn (Hermit)
on Jun 16, 2004 at 16:13 UTC


in reply to Re^4: Comparison of the parsing features of CSV (and xSV) modules
in thread Comparison of the parsing features of CSV (and xSV) modules

And, what should the parser do with the following:
"Smith","John",12/31/1962,"Author of "How to Break Programs" and other books,"Bugger"
    -> "Smith","John",12/31/1962,"Author of ""How to Break Programs"" and other books,"Bugger"
"Smith","John",12/31/1962,Author of "How to Break Programs" and other books,"Bugger"
    -> "Smith","John",12/31/1962,Author of ""How to Break Programs"" and other books,"Bugger"
"Smith","John",12/31/1962,'Author of "How to Break Programs" and other books,"Bugger"
    -> (Reject?)
"Smith","John",12/31/1962,'Author of "How to Break Programs" and other books',"Bugger"
    -> (Reject?)

(I haven't encountered any improperly quoted data, just data that doesn't escape embedded delimiters.)
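
One way to make "data that doesn't escape embedded delimiters" concrete is a lenient splitter in which a quote closes a quoted field only when the delimiter or the end of the line follows it. The sketch below is in Python and is only an illustration of that single heuristic, not how any existing CSV module works; the function name `parse_dusty_line` and its parameters are hypothetical.

```python
def parse_dusty_line(line, delim=",", quote='"'):
    """Split one xSV line, tolerating unescaped quotes inside fields.

    Assumed heuristic (not a standard): inside a quoted field, a quote
    ends the field only if the delimiter or the end of the line follows.
    A doubled quote is read as one literal quote; any other quote is
    kept as data. Embedded newlines are not handled.
    """
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == quote:
            i += 1  # skip the opening quote
            buf = []
            while i < n:
                ch = line[i]
                if ch == quote:
                    nxt = line[i + 1] if i + 1 < n else None
                    if nxt is None or nxt == delim:
                        i += 1  # consume the closing quote
                        break
                    if nxt == quote:
                        buf.append(quote)  # "" -> one literal quote
                        i += 2
                        continue
                buf.append(ch)  # ordinary character or stray quote
                i += 1
            else:
                # Ran off the end without finding a closing quote.
                raise ValueError("unterminated quoted field")
            fields.append("".join(buf))
        else:
            # Unquoted field: everything up to the next delimiter.
            j = line.find(delim, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            break
        i += 1  # skip the delimiter
    return fields


# Variant of the first sample row with a closing quote added after "books"
# (that closing quote is an assumption; the row as posted lacks it).
row = '"Smith","John",12/31/1962,"Author of "How to Break Programs" and other books","Bugger"'
print(parse_dusty_line(row))
```

With the closing quote in place, the row splits into five fields and the book title keeps its quotes. The first sample row exactly as posted has no quote after "books", so this heuristic folds the rest of the line into the fourth field and returns only four fields, which is one reason a rule like this can only ever be opt-in.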

Wally Hartshorn

Replies are listed 'Best First'.
Re^6: Comparison of the parsing features of CSV (and xSV) modules
by dragonchild (Archbishop) on Jun 16, 2004 at 16:37 UTC
    What about the following:
    abcd,"efgh,"ijkl,"mnop",qrst
    Is that malformed, or is it meant to be the following?
    abcd,"efgh,""ijkl,""mnop",qrst

    The issue is that there are too many edge cases for a general-purpose parser to handle. I'm coming up with a bunch and I'm not even trying hard.
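
To illustrate how much the answer depends on the parser's own rules, here is a small sketch that feeds the line above to Python's standard `csv` module in its lenient and strict modes. Python is used purely as a convenient example of a general-purpose parser; the helper name `read_one` is hypothetical.

```python
import csv

LINE = 'abcd,"efgh,"ijkl,"mnop",qrst'


def read_one(line, strict):
    """Parse a single line with csv.reader; return None if it is rejected."""
    try:
        return next(csv.reader([line], strict=strict))
    except csv.Error:
        return None


lenient = read_one(LINE, strict=False)
strict = read_one(LINE, strict=True)

print(lenient)  # ['abcd', 'efgh,ijkl', 'mnop', 'qrst']
print(strict)   # None (the line is rejected as malformed)
```

The lenient mode silently drops the stray quote and returns four fields, the strict mode rejects the line, and neither produces the three-field reading with doubled quotes. All three outcomes are defensible, which is the ambiguity in question.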

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

    I shouldn't have to say this, but any code, unless otherwise stated, is untested

      Ah! Now I know what you're asking. Well, my point wasn't that CSV handlers need to be able to handle every possible bit of crud that is thrown at them. I was just saying that it would be useful if they would handle unescaped embedded delimiters -- perhaps not absolutely dirty data, but at least somewhat dusty data. :-)

      Wally Hartshorn

        Heh. Define "dusty". We can tell, but we have to be able to define it to a teddybear in order to have a computer handle it correctly. :-)

