in reply to Determine encoding of STDOUT

It really is much easier if you simply arrange that all consumers of your output expect UTF‑8, not some legacy encoding.

For example, on a Mac or any other modern Unix system, I always set my window encoding to use UTF‑8. Setting it to use something like crufty old MacRoman would just set me up for a world of pain.

No thanks!

Re^2: Determine encoding of STDOUT
by Dirk80 (Pilgrim) on May 04, 2011 at 13:59 UTC

    I think you are right. The best way is to state in the script's help text that its STDOUT is encoded as UTF-8.
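
    A minimal sketch of that idea, assuming the script has a --help option; the usage() helper and its wording are hypothetical, only binmode and the :encoding layer are standard Perl:

```perl
use strict;
use warnings;

# Encode everything printed to STDOUT and STDERR as UTF-8,
# regardless of what the terminal or the locale claims.
binmode(STDOUT, ':encoding(UTF-8)');
binmode(STDERR, ':encoding(UTF-8)');

# Hypothetical help text that documents the output encoding for consumers.
sub usage {
    return "Usage: $0 [--help]\n"
         . "Output on STDOUT is always encoded as UTF-8.\n";
}

# Print the help text only when --help is given on the command line.
print usage() if grep { $_ eq '--help' } @ARGV;
```

    With this in place, anything that reads the script's output knows to decode it as UTF-8, and the script never has to guess.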

    I looked into how to change the encoding of "cmd.exe".

    I could do it temporarily as follows:

    • change font to "Lucida Console"
    • enter chcp 65001 to change the encoding to UTF-8

    But the switch to codepage 65001 is NOT stored permanently. If I open a new "cmd.exe", its codepage is 437 again.

    A bit off-topic, because this question is Windows-specific and not about Perl: do you know how to change the codepage permanently to 65001 in Windows XP?

    Would you recommend putting the following code at the beginning of the script:

    `chcp 65001`;

    That way I would ensure that the encoding is UTF-8. Of course I should find out the active codepage (e.g. 437) at the beginning of the script and then restore it at the end (perhaps in an END block). And I should make it dependent on the OS, since this would be a Windows-specific solution.
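
    The plan could be sketched roughly like this. It is untested on a real Windows XP console and rests on some assumptions: that chcp is on the PATH, and that its output contains the codepage number somewhere (the surrounding wording is localized). parse_codepage is a hypothetical helper name. Switching the codepage does not guarantee that the console then renders UTF-8 correctly.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the codepage number from chcp's output,
# e.g. "Active code page: 437". The text is localized, so match digits only.
sub parse_codepage {
    my ($text) = @_;
    return undef unless defined $text;
    return $text =~ /(\d+)/ ? $1 : undef;
}

my $original_cp;    # remembered so that the END block can restore it

if ($^O eq 'MSWin32') {
    # Windows only: note the active codepage, then switch to UTF-8 (65001).
    $original_cp = parse_codepage(scalar `chcp`);
    system('chcp 65001 > NUL') if defined $original_cp;
}

# On every OS, encode the script's output as UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

END {
    if (defined $original_cp) {
        local $?;    # keep the script's own exit status intact
        system("chcp $original_cp > NUL");
    }
}
```

    On any other OS the codepage handling is skipped entirely and only the binmode call takes effect.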

      You might change the shortcut to Command Prompt to run chcp automatically. For example, my shortcut contains %windir%\system32\cmd.exe /F:ON /k doskey /macrofile="%USERPROFILE%\doskey.mac"

      The /k means "run this and stay open".

      You can also set a hotkey for the program and start it with, say, CTRL+SHIFT+`.

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

Re^2: Determine encoding of STDOUT
by ikegami (Patriarch) on May 04, 2011 at 17:00 UTC

    The UTF-8 support for the Windows console is full of problems, although I wonder if some go away if a cygwin bash shell is used.

      If I want a portable program, I’ll generate Unicode output.

      If I want a non-portable program, I’ll generate output in some non-portable, legacy vendor encoding.

      I never ever do the second.

      It’s a shame that Microsoft is still lagging behind on proper Unicode support, but that is hardly Perl’s fault. Perl makes it easy to write portable programs, and bending over backwards to accommodate Microsoft-only idio(t)syncrasies seems like a self-limiting and very niche environment.

        It’s a shame that Microsoft is still lagging behind on proper Unicode support,

        It's great outside the console, but I agree about the console. Microsoft's lack of attention to the shell and related features is disappointing to say the least.

        bending over backwards to accommodate Microsoft-only idio(t)syncrasies

        Baseless bashing. Unix has locales too. It's not Microsoft's fault that the following doesn't work on Windows:

        use open IO => ':locale';
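
        For comparison, here is a sketch of roughly the kind of lookup a locale-based layer depends on, assuming a Unix-like system where I18N::Langinfo is available. locale_codeset is a hypothetical helper name, and this is not meant as a copy of what open.pm does internally:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);
use Encode ();

# Hypothetical helper: ask the C library which character set
# the user's locale declares.
sub locale_codeset {
    setlocale(LC_CTYPE, '');     # adopt the locale from the environment
    return langinfo(CODESET);    # e.g. "UTF-8" or "ISO-8859-1"
}

my $codeset = locale_codeset();

# Apply an encoding layer only if Encode recognizes the reported name;
# otherwise leave STDOUT untouched.
if ($codeset && Encode::find_encoding($codeset)) {
    binmode(STDOUT, ":encoding($codeset)");
}
```

        The guard matters because some locales report a codeset name that Encode may not know, in which case the handle is left as it was.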

        Microsoft-only ... a self-limiting and very niche environment.

        85% of installations worldwide is a strange, some might say blinkered, definition of "niche".