Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on a project that uses CSV files to store data (unfortunately it's the only option :(). I want to ensure that there aren't any problems with unescaped symbols ('|' in this case), so I need to escape them to their character codes. Is there a standard pure-Perl module for this? What do other monks use? Should I escape all character entities or just the delimiter? Thank you for your advice.

Re: Escaping data for CSV files
by valdez (Monsignor) on Jun 21, 2003 at 12:49 UTC

    Perl's most powerful tool is CPAN; there you can find Text::CSV, which does exactly what you need.

    Ciao, Valerio

      It's still at version 0.01 and was released in 1997? Have you found it to be stable? Thanks.

        CPAN Testers says that the module passes all tests, but there is also a known bug; you can see it on the Text::CSV page. I don't know whether the module is still being maintained by its author.

        Ciao, Valerio

        Yup, I use it pretty regularly and it works fine for most things. It can get slow parsing large records (large = 1000s of characters), but I haven't found any bugs or other "instabilities" with it.

        -Tats
Re: Escaping data for CSV files
by grantm (Parson) on Jun 21, 2003 at 19:39 UTC

    I don't think you need to go overboard with escaping things. The "|" symbol, for example, does not need escaping in a CSV file. The common form of CSV, as understood by, say, Excel, works like this:

    • If a field contains a comma, a newline (yes, embedded newlines are legal), or a double-quote character, then the whole field must be enclosed in double quotes.
    • Any double-quote characters in a field must be 'escaped' by preceding them with another double quote (i.e., " becomes "").
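
    The two rules above can be sketched in core Perl roughly as follows. This is only a minimal illustration, not a replacement for a tested module: the helper names csv_quote_field and csv_join are made up for this example, and treating a carriage return as a line break is an assumption on my part.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: quote a single field using the two rules above.
    sub csv_quote_field {
        my ($field) = @_;
        $field = '' unless defined $field;
        # Quote only when the field holds a comma, a double quote, or a
        # line break (carriage return included here as an assumption).
        if ($field =~ /[",\r\n]/) {
            $field =~ s/"/""/g;       # double every embedded double quote
            $field = qq{"$field"};    # then wrap the whole field in quotes
        }
        return $field;
    }

    # Hypothetical helper: join fields into one record (no line ending added).
    sub csv_join {
        return join ',', map { csv_quote_field($_) } @_;
    }

    # 'a|b' passes through untouched; the other fields get quoted.
    print csv_join('plain', 'a|b', 'x,y', 'say "hi"', "two\nlines"), "\n";
    ```

    Note that the "|" field comes out unchanged, while the field with the embedded newline stays a single quoted field that spans two physical lines.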
Re: Escaping data for CSV files
by Anonymous Monk on Jun 21, 2003 at 12:29 UTC

    A follow-up:

    How do you handle newlines & spaces? I was thinking of placing each new entry on a new line in the file. Is replacing any type or amount of whitespace with a single space the best approach? Thanks.

      There are a whole lot of modules to be found on CPAN if you search for "CSV". One of these is nearly guaranteed to fit the bill.
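
      For reference, the whitespace idea from the question could be sketched in core Perl like this. The helper name collapse_whitespace is made up for illustration, and trimming the ends is my own assumption rather than something the question asked for. Keep in mind that the approach is lossy: once the line breaks are collapsed, the original layout cannot be recovered.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: collapse every run of whitespace to one space.
      sub collapse_whitespace {
          my ($text) = @_;
          $text =~ s/\s+/ /g;     # newlines, tabs, repeated spaces become ' '
          $text =~ s/^ | $//g;    # assumption: also trim the leftover end spaces
          return $text;
      }

      # Each entry now fits on one physical line of the file.
      print collapse_whitespace("  first line\n\tsecond   line \n"), "\n";
      ```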

      Makeshifts last the longest.