On my system, the active CP seems to change between 437 and 1252 at times
1252 is your ANSI CP. 437 is your OEM CP. Don't ask me what's used for what, but it's easy to find by testing. (I think you'll find the ACP used for system calls, whereas the OEMCP will be used for console IO.)
Encode's encode and decode functions can handle Windows code pages. Just prepend "cp" to the number (e.g. cp1252, cp437).
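A minimal sketch of that, assuming the bytes really are cp1252 on the way in and that cp437 is what you want on the way out (the sample byte is made up for illustration):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Byte 0xE9 is "e with acute accent" in cp1252 (the ANSI code page here).
my $ansi_bytes = "\xE9";

# decode: bytes in a known code page -> Perl character string.
my $text = decode('cp1252', $ansi_bytes);

# encode: character string -> bytes in another code page (the OEM one here).
my $oem_bytes = encode('cp437', $text);

printf "U+%04X -> 0x%02X\n", ord($text), ord($oem_bytes);   # U+00E9 -> 0x82
```

The same character ends up as a different byte in each code page, which is why guessing the wrong one produces the wrong letter rather than an error.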
"why doesn't Perl handle this conversion itself by determining what command output encoding is in effect and converting as needed?"
How could it even reliably determine whether the output was text? A command launched via qx could easily be outputting binary data (e.g. an image; some compressed data; etc). "Converting encoding" on binary data is very likely to corrupt it.
perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
How about something like qx/command/t to indicate text output?
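There is no such flag today, but something close can be sketched as an ordinary sub. The name qx_text and its argument order are hypothetical, and the caller has to know (or guess) the encoding and vouch that the output is text:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for the proposed qx/.../t: run a command and
# decode its output from the named encoding into a character string.
sub qx_text {
    my ($encoding, $command) = @_;
    my $bytes = qx{$command};          # raw bytes, exactly as the command wrote them
    return decode($encoding, $bytes);
}

# Example: my $listing = qx_text('cp437', 'dir');
```

Plain qx stays available for commands that emit binary data, so nothing gets "converted" unless it is asked for.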
Try posting code. The console traditionally uses a legacy one-byte-per-character code page. Type "chcp" at your console to see which one it is. Programs normally print legacy code page data to the console, not UTF-8; UTF-8 printed to the console would look like gibberish. You can also run into truncation/substitution problems, where your non-Latin letters become real one-byte "?" characters. Technically a program can print binary to the console, which Unix-ish tools often do. You could also try marking STDIN/STDOUT as UTF-8; I'm not sure how successful that is with Perl on Windows (worst case, the console spits out legacy code page data and Perl converts all the invalid UTF-8 sequences to filler characters).
By "at your console" I assume you mean "in a cmd.exe window." Right after a reboot, chcp reports the active console code page as 437. Later on, some undetermined process changes it. After that, chcp reports the code page as 1252. My script does set STDOUT to UTF8 with binmode and the output, redirected to a file, is correct. I tried setting STDIN to UTF8, but this seems to have no effect on the behavior of qx.
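That binmode on STDIN has no effect is expected: qx reads from its own pipe to the child process, so layers set on STDIN or STDOUT never touch what it returns. Since the code page moves between 437 and 1252, one option is to ask for it at run time and decode the qx result by hand. A rough sketch, assuming the command writes in the current console output code page (not every program does) and that the Win32 module provides GetConsoleOutputCP; the helper name console_encoding is made up:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: name the Encode encoding matching the console's
# current output code page. Off Windows, or if the call is unavailable,
# return the caller's fallback instead.
sub console_encoding {
    my ($fallback) = @_;
    if ($^O eq 'MSWin32' && eval { require Win32; 1 }) {
        my $cp = eval { Win32::GetConsoleOutputCP() };
        return "cp$cp" if $cp;
    }
    return $fallback;
}

my $encoding = console_encoding('cp437');   # cp437 is an assumed default

# With real output: my $text = decode($encoding, scalar qx{some_command});
my $sample = decode('cp437', "caf\x82");    # stand-in bytes, decoded as cp437
```

The decoded string can then go out through the UTF-8 layer already set on STDOUT.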