Hi, I found a little more time to look at your problem. I was recently dealing with a related issue.

There's a well-known bug in MS Excel's handling of character encodings in CSV files. You can read all about in on StackOverflow and elsewhere. The short answer is that you need to encode your CSV file not as UTF-8 but as UTF-16LE in order for Excel to recognize it properly.

Here is a cleaned-up version of your script. For input I used an .xlsx file containing the data you showed below. This creates a CSV file encoded in UTF-16, so you won't be able to read it as a text file in your terminal if the terminal is set to UTF-8. But Excel will read the file and display the contents correctly.

use strict; use warnings; use autodie; use Spreadsheet::Read qw/ ReadData rows /; use Text::CSV; my $outputFile = '1198074.csv'; my $inputFile = '1198074.xlsx'; my $book = ReadData( $inputFile ); my @writeRows; for my $row ( rows( $book->[1] ) ) { my ( $id, $last, $first, $pref ) = @{ $row }; push @writeRows, [ $pref || $first, $last, 'Static', 'Text' ]; } open my $fh, '>:encoding(UTF-16LE)', $outputFile; my $csv = Text::CSV->new( { binary => 1, eol => "\n", auto_diag => 1 } + ); $csv->print( $fh, $_ ) for @writeRows; close $fh; __END__

Hope this helps!


The way forward always starts with a minimal test.

In reply to Re: XLSX to CSV with high ASCII characters by 1nickt
in thread XLSX to CSV with high ASCII characters by apu

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.