Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to get the Windows Console (cmd.exe) to play ball with Unicode, but the behaviour differs between console output and redirected output.

I can make the output of the Console look nice when using:

binmode( STDOUT, ':unix:utf8');

However, when output is redirected, all line endings are then LF instead of CRLF, which some Windows software doesn't like. To get both CRLF line endings and UTF-8, I can use:

binmode( STDOUT, ':utf8');

or simply invoke perl with the -CS option, or set the PERL_UNICODE environment variable to 7.

However, that mangles console output badly by repeating the last character:

D:\>chcp
Active code page: 65001

D:\>perl -E"autoflush STDOUT; print \"Hello, World\xC2\xB2.\";print ':'"
Hello, WorldČ..:

D:\>perl -CS -E"autoflush STDOUT; print \"Hello, World\N{SUPERSCRIPT TWO}.\";print ':'"
Hello, WorldČ..:

The number of repetitions corresponds to the difference in character length vs byte length of the UTF-8 string.
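As a rough illustration of the byte-versus-character difference mentioned above, the following sketch computes both lengths for the test string. The helper name utf8_surplus is hypothetical and only serves this example; it assumes the Encode module that ships with Perl.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: how many more bytes than characters the string
# occupies once it is encoded as UTF-8.
sub utf8_surplus {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    return length($bytes) - length($text);
}

# U+00B2 is SUPERSCRIPT TWO; it takes two bytes in UTF-8.
my $text = "Hello, World\x{B2}.";
printf "characters: %d, bytes: %d, surplus: %d\n",
    length($text), length(encode('UTF-8', $text)), utf8_surplus($text);
```

For the string used in the transcripts the surplus is 1, which matches the single repeated character seen on the console.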

Redirecting the output of the first command above to a file and typing the file in the same console produces the correct output:

D:\>perl -E"autoflush STDOUT; print \"Hello, World\xC2\xB2.\";print ':'" > string.txt

D:\>type string.txt
Hello, WorldČ.:

So the wisdom I seek is: Does a workaround exist that will allow correct Unicode output on both console and redirected output?

BTW: Microsoft's latest version of Notepad handles LF line endings. As more and more software handles LF line endings without issues, the proper long-term solution could be to go with LF endings.

However, that would raise another issue: Perl's Unicode implementation for Windows would have to be changed in order for the PERL_UNICODE environment variable and the perlrun options to stop emitting CRLF line endings on output.

The behaviour described above is the same for ActivePerl and Strawberry Perl.
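One possible direction for the workaround asked about above, as a minimal sketch only: pick between the two layer strings from this post depending on whether STDOUT is attached to a terminal. The helper name stdout_layers is hypothetical, and the sketch assumes that -t STDOUT reliably distinguishes the console from a redirected file, which has not been verified on every Windows setup.

```perl
use strict;
use warnings;

# Hypothetical helper: choose one of the two layer strings from the post.
# ':unix:utf8' is the variant reported to look right on the console;
# ':utf8' is the variant reported to keep CRLF line endings when redirected.
sub stdout_layers {
    my ($is_terminal) = @_;
    return $is_terminal ? ':unix:utf8' : ':utf8';
}

# Assumption: -t is true for the console and false for a file or pipe.
my $is_terminal = -t STDOUT;
binmode(STDOUT, stdout_layers($is_terminal)) or die "binmode failed: $!";

print "Hello, World\x{B2}.\n";
```

This keeps the choice in one place, so a different layer string can be swapped in if the console variant turns out to misbehave on a given Windows version.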

Re: Windows console mangles UTF8 output
by ikegami (Patriarch) on Jun 30, 2019 at 03:14 UTC

    It's a bug in the Windows console. The explanation is in the first reply to Perl RT#121783. It was fixed in Windows 8.

    Update: Fixed link.

      "Bug #121783 for Net-BrowserID-Verify: Test host has expired certificate"???

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

      Thanks!

      Works properly from Windows Server 2016 onwards :)

      Fails in Windows Server 2012 and earlier

Re: Windows console mangles UTF8 output
by nikosv (Deacon) on Jul 01, 2019 at 16:41 UTC