in reply to Character encoding in console in Windows

binmode STDIN, ':encoding(UTF-8)';

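First, you're assuming the terminal uses UTF-8. That's unlikely on Windows and not necessarily true on unix. chcp will tell you the console's current code page on a Windows system (usually an OEM code page such as cp437 or cp850; cp1252 is the usual ANSI code page on Western European systems, not the console default).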
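As a rough sketch of picking the STDIN layer from the console's code page rather than hard-coding UTF-8: the helper names encoding_for_code_page and console_input_encoding are made up for illustration, and the UTF-8 fallback on non-Windows systems is only an assumption. Win32::GetConsoleCP comes from the Win32 module and returns the same number that chcp prints.

```perl
use strict;
use warnings;
use Encode ();

# Map a Windows code page number to an Encode encoding name.
# (Hypothetical helper, not part of any module.)
sub encoding_for_code_page {
    my ($cp) = @_;
    return 'UTF-8' if $cp == 65001;    # 65001 is the UTF-8 code page
    return "cp$cp";                    # e.g. 437 -> "cp437"
}

# Guess the encoding of console input.
# ASSUMPTION: anything that is not Windows is treated as UTF-8,
# which is common but not guaranteed.
sub console_input_encoding {
    if ($^O eq 'MSWin32') {
        require Win32;
        return encoding_for_code_page(Win32::GetConsoleCP());
    }
    return 'UTF-8';
}

my $enc = console_input_encoding();

# Encode does not ship a table for every code page, so check first.
die "Encode does not support $enc\n"
    if !Encode::find_encoding($enc);

binmode STDIN, ":encoding($enc)";
```

This only decodes what the console can represent in its code page; characters outside that code page are already lost by the time Perl reads them.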

Second, Perl's file operators expect file names to be strings of bytes. If you decode them, you'll need to re-encode them before passing them to open, -e, stat and friends.
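A minimal sketch of that decode/re-encode round trip, under stated assumptions: cp850 stands in for the console code page and cp1252 for the ANSI code page that the byte-oriented file functions use on a Western European Windows install. Substitute whatever your system reports. The function name console_to_fs_name is hypothetical.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Turn file-name bytes read from the console into bytes suitable for
# Perl's file operators. (Hypothetical helper for illustration.)
sub console_to_fs_name {
    my ($bytes, $console_enc, $fs_enc) = @_;
    my $chars = decode($console_enc, $bytes);   # bytes -> characters
    return encode($fs_enc, $chars);             # characters -> bytes
}

# ASSUMPTION: these two encodings are examples, not universal values.
my $console_enc = 'cp850';
my $fs_enc      = 'cp1252';

# "caf\x82.txt" is how "café.txt" arrives from a cp850 console.
my $raw_name = "caf\x82.txt";

my $name    = decode($console_enc, $raw_name);  # use this for length, regexes, etc.
my $os_name = console_to_fs_name($raw_name, $console_enc, $fs_enc);

printf "%d characters, %d bytes for the file system\n",
    length($name), length($os_name);

# File operators get the re-encoded bytes, never the decoded string.
print "file exists\n" if -e $os_name;
```

If the name contains a character that the target code page lacks, encode substitutes a replacement by default, so the resulting bytes will not match the real file.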

Finally, if you have to deal with files whose names contain characters that don't exist in your code page, you'll encounter a third problem. See Re: stat() and utf8 filenames on Win32 fails for me, why? for a bit on that.

Re^2: Character encoding in console in Windows
by elef (Friar) on Sep 14, 2010 at 09:49 UTC
    First, you're assuming the terminal uses UTF-8.

    Well, binmode STDIN was a first shot in the dark to see if it fixes or changes anything, not a carefully analysed solution. I know it doesn't work.

    Anyway, it looks like there is no one-liner to solve this, so I'm calling it a day. (For example, your writeup says "Decode the file name from whatever encoding your source uses"... well, I have a horrible feeling that the encoding from CMD.exe will differ based on the localization of the OS, so there is no solution that will work for every Windows computer.) Even if there is a way to do this, it sounds like it would take more research than it's worth.
    It's quite odd though that there is no simple, tried and tested universal solution for what I'd call pretty basic functionality. It just goes to show what an inexcusable, horrid mess encoding is in general.

    Even if we were to forget about opening files based on input from the console and just hardcode the path into the perl script, it looks like one would need to use Win32API::File and at least createFile and OsFHandleOpen, or God knows what else. Half of your post on this went right over my head, to be honest.