I'm trying to get the Windows Console (cmd.exe) to play ball with Unicode, but the behaviour differs between console output and redirected output.

I can make the output of the Console look nice when using:

binmode( STDOUT, ':unix:utf8');

However, when output is redirected, all line endings are then LF instead of CRLF, which some Windows software doesn't like. To get both CRLF line endings and UTF-8, I can use:

binmode( STDOUT, ':utf8');

or simply invoke perl with the -CS option, or set the PERL_UNICODE environment variable to 7.
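To make the line-ending difference concrete, here is a minimal sketch that writes the same text through two layer stacks into in-memory buffers and dumps the resulting bytes. The helper name bytes_written is hypothetical, and the stacks are stand-ins: ':crlf:utf8' spells out the CRLF layer that Windows includes in STDOUT's default layers, while ':raw:utf8' mimics a stack without it, like ':unix:utf8' on a real handle. It illustrates the layers only and does not reproduce the console problem.

```perl
use strict;
use warnings;

# Hypothetical helper: write $text through the given layer stack into an
# in-memory buffer and return the raw bytes that would reach a file.
sub bytes_written {
    my ($layers, $text) = @_;
    my $buf = '';
    open my $fh, ">$layers", \$buf or die "open failed: $!";
    print {$fh} $text;
    close $fh or die "close failed: $!";
    return $buf;
}

my $text = "World\x{B2}\n";    # U+00B2 SUPERSCRIPT TWO, then a newline

# ':crlf' is named explicitly so the sketch behaves the same on any platform.
my $with_crlf    = bytes_written(':crlf:utf8', $text);
my $without_crlf = bytes_written(':raw:utf8',  $text);

# Dump each result as space-separated hex bytes.
printf "%-11s %s\n", ':crlf:utf8', join ' ', unpack '(H2)*', $with_crlf;
printf "%-11s %s\n", ':raw:utf8',  join ' ', unpack '(H2)*', $without_crlf;
```

In both cases the superscript two should come out as the two bytes C2 B2; only the stack containing ':crlf' should end the line with 0D 0A rather than a bare 0A.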

However, with the :utf8 layer (or -CS), the console output is badly mangled: the last character is repeated.

D:\>chcp
Active code page: 65001

D:\>perl -E"autoflush STDOUT; print \"Hello, World\xC2\xB2.\";print ':'"
Hello, World²..:

D:\>perl -CS -E"autoflush STDOUT; print \"Hello, World\N{SUPERSCRIPT TWO}.\";print ':'"
Hello, World²..:

The number of repetitions corresponds to the difference in character length vs byte length of the UTF-8 string.
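As a worked check of that arithmetic, a short sketch using the core Encode module compares the two lengths for the string from the examples above. The variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $string = "Hello, World\x{B2}.";             # the text printed above
my $chars  = length $string;                    # length in characters
my $bytes  = length encode('UTF-8', $string);   # length once encoded as UTF-8

printf "characters: %d, bytes: %d, difference: %d\n",
    $chars, $bytes, $bytes - $chars;
# prints: characters: 14, bytes: 15, difference: 1
```

The superscript two is the only character that needs two bytes in UTF-8, so the difference is 1, which matches the single repeated "." in the console output.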

Redirecting the above output to a file and typing the file in the same console produces the correct output:

D:\>perl -E"autoflush STDOUT; print \"Hello, World\xC2\xB2.\";print ':'" > string.txt

D:\>type string.txt
Hello, World².:

So the wisdom I seek is: Does a workaround exist that will allow correct Unicode output on both console and redirected output?

BTW: Microsoft's latest version of Notepad handles LF line endings. As more and more software handles LF line endings without issues, the proper long-term solution could be to go with LF endings.

However, that would raise another issue: Perl's Unicode implementation for Windows would have to be changed so that the PERL_UNICODE environment variable and the perlrun options no longer produce CRLF line endings on output.

The behaviour described above is the same for ActivePerl and Strawberry Perl.


In reply to Windows console mangles UTF8 output by Anonymous Monk
