Here's an annoying one. The quirky but uber-fast Text::CSV_XS module can handle newlines embedded in fields if you turn on binary mode (binary => 1). MySQL's SELECT INTO OUTFILE and LOAD DATA INFILE can also handle embedded newlines. However, MySQL wants to prefix them with the escape character (a '\' in my case), and Text::CSV_XS chokes on that.

After fiddling with the options for both, I can't find a way to make them agree. It looks like my only option might be to s{\\n}{\n}g on the way out of MySQL and then reverse that on the way back in after processing with Text::CSV_XS. That is likely to hurt performance, since I am using the getline() and print() methods in Text::CSV_XS and I would have to switch to reading and printing the lines myself. (Reading the lines and dealing with embedded newlines myself? Ick!)

If anyone has any suggestions, I would appreciate them. Here's a (reduced) sample of the code I'm using:

use Text::CSV_XS; use IO::Handle; use Carp qw(croak); my $file = $ARGV[0]; my $parser = Text::CSV_XS->new({ binary => 1, always_quote => 1, escape_char => '\\', #' help emacs }); # open input and output temp files open(my $in_fh, '<', $file) or croak("Unable to open input file $!"); my $line_num = 0; # Using while(1) here and exiting via last() so we can avoid testing t +o see # if it's the last line (with eof) more than once per line. while (not eof($in_fh)) { $line_num++; # parse the row and handle any errors my $in_values = $parser->getline($in_fh); croak("Unable to parse line $line_num of file $file: " . $parser-> +error_input()) if !$in_values; } close $in_fh or croak("Error closing file $file: $!");

And the SQL looks like this:

SELECT blah blah blah INTO OUTFILE ? FIELDS TERMINATED BY ',' ESCAPED BY '\\' ENCLOSED BY '"' LINES TERMINATED BY '\n'

In reply to Text::CSV_XS and MySQL newline handling by perrin

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.