The code below just reads a CSV file and writes it back out. If the input file 'test.csv' contains only ASCII characters, the output file 'new.csv' has no quotes around strings with embedded spaces. That is the expected behavior, since the code uses quote_space => 0. But if a string with embedded spaces also contains non-ASCII characters, it is written to the output file with quotes around it. This looks like a bug in Text::CSV_XS: I expect the same result with quote_space => 0 regardless of which characters the string contains. I really don't want the quotes, so my question is: how do I get this code to work for both ASCII and non-ASCII (UTF-8) data?

To be precise, "work" means no quotes around text that has embedded spaces. The code does this for ASCII data but not when the text contains UTF-8 characters (sample below): a string with UTF-8 characters and embedded spaces gets double quotes around it. That is the wrong behavior when using quote_space => 0.

This is the input file 'test.csv'. It contains Japanese characters (UTF-8):

hi,bye,test is great,test what is your name,is,これ 試験 this is a test,test

This is the file 'new.csv' that the script creates:

hi,bye,test is great,test what is your name,is,"これ 試験 this is a test",test

The problem is the double quotes around the string with Japanese characters. With quote_space => 0, I expect that no string gets double quotes just because it has embedded spaces, regardless of what type of characters it contains.

use Text::CSV_XS;
use encoding 'utf8';

my @rows;

# quote_space => 0: do not quote a field only because it contains a space
my $csv = Text::CSV_XS->new ({ quote_space => 0, binary => 1 })
    or die "Cannot use CSV: ".Text::CSV_XS->error_diag ();

# Read every row of the input file
open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
while (my $row = $csv->getline ($fh)) {
    push @rows, $row;
}
$csv->eof or $csv->error_diag ();
close $fh;

# Write the rows back out unchanged, using CRLF line endings
$csv->eol ("\r\n");
open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
$csv->print ($fh, $_) for @rows;
close $fh or die "new.csv: $!";
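
A smaller reproduction that skips the files may make this easier to test. It is only a sketch and rests on an assumption I have not verified: that the quoting is decided by the content of the field alone, so combine/string should show the same thing as getline/print. The two sample fields are the ones from 'test.csv' above.

```perl
use strict;
use warnings;
use utf8;                            # the string literals below are UTF-8
use Text::CSV_XS;

binmode STDOUT, ":encoding(utf8)";   # print the result as UTF-8

# Same constructor options as in the script above
my $csv = Text::CSV_XS->new ({ quote_space => 0, binary => 1 })
    or die "Cannot use CSV: " . Text::CSV_XS->error_diag ();

# One ASCII-only field and one field with Japanese characters,
# both with embedded spaces
for my $field ("this is a test", "これ 試験 this is a test") {
    $csv->combine ("hi", $field, "test")
        or die "combine failed: " . $csv->error_diag ();
    print $csv->string, "\n";
}
```

If the assumption holds, the first line printed should have no quotes and the second should show the unwanted quotes around the Japanese field.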

In reply to CSV_XS and UTF8 strings by beerman
