From my brief research ( microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/chcp.mspx?mfr=true ), trying to set the console encoding with system("chcp XXX") seems like a bad idea. It doesn't even seem possible to set it to UTF-8, and setting it to anything else will probably make it impossible to use the very characters that are causing the problems in the first place, because there is no single OEM codepage that covers all the characters people may use... and I'd be reluctant to mess with the settings of other people's computers anyway (does this setting only affect the current session or is it permanent?)
So reading the console's encoding (which depends on OS localization) and then converting the incoming text in Perl accordingly sounds better to me... but I'm just taking stabs in the dark. I can't follow half of the posts here, but I can't see working code in any of them so far.
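To make the "read the encoding, then convert" idea concrete, here is a minimal sketch, not tested on every Windows setup. It assumes the Win32 module is recent enough to provide Win32::GetConsoleCP(), and that Encode knows the code page under a name like "cp850". The helper names decode_console_bytes and console_codepage are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn raw console bytes into a Perl character string,
# given a numeric Windows code page such as 850 or 437.
sub decode_console_bytes {
    my ($bytes, $codepage) = @_;
    # Encode names these encodings "cp850", "cp437", ...; 65001 means UTF-8.
    my $encoding = $codepage == 65001 ? 'UTF-8' : "cp$codepage";
    return decode($encoding, $bytes);
}

# Assumption: on Windows the Win32 module reports the console input code page.
# Elsewhere this falls back to 850, purely so the sketch can be run anywhere.
sub console_codepage {
    return 850 unless $^O eq 'MSWin32';
    require Win32;
    return Win32::GetConsoleCP();
}

# Byte 0xA1 is "í" in code page 850, so this mimics what STDIN would deliver.
my $sample = "c:\\folder\\\xA1.txt";
my $text   = decode_console_bytes($sample, 850);
printf "U+%04X\n", ord(substr($text, 10, 1));   # prints U+00ED
```

In a real script the second argument would come from console_codepage() and the first from a chomped <STDIN> line.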
BTW as I said before, this is just one half of the issue.
Even if I were to go
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
open(FILE, "<:encoding(UTF-8)", "c:\\folder\\í.txt") or print "Oops, can't open file: $!";
<STDIN>;
...and save this in UTF-8, it would still fail to open the file. It seems pretty clear that I'd need to use one of the modules to ever be able to open a file with a non-ASCII name, and I can't really make sense of the documentation of the modules.
So the step-by-step seems to be:
1) read what the console's OEM encoding is
2) convert filepath received via STDIN from OEM to UTF-8
3) open the file using one of the Unicode modules
...or maybe I'm completely wrong.
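For step 3, here is a rough sketch of what opening by a Unicode name might look like once the path is already a Perl character string. It assumes the CPAN module Win32::LongPath is installed and that its openL takes a reference for the filehandle, as I understand its documentation. open_unicode_path is a made-up helper name, and the non-Windows branch is only there so the sketch can be tried elsewhere.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: open a file for reading when its name is a Perl
# character string. Returns a filehandle, or undef on failure.
sub open_unicode_path {
    my ($path) = @_;
    my $fh;
    if ($^O eq 'MSWin32') {
        # Assumption: Win32::LongPath is installed; openL hands the name
        # to the wide-character Windows API instead of the ANSI one.
        require Win32::LongPath;
        Win32::LongPath::openL(\$fh, '<', $path) or return;
    }
    else {
        # Most other systems store file names as bytes, usually UTF-8.
        open($fh, '<', encode('UTF-8', $path)) or return;
    }
    binmode($fh, ':encoding(UTF-8)');   # the file's contents are UTF-8 text
    return $fh;
}

# U+00ED is "í"; written as an escape so the script's own encoding is moot.
my $path = "c:\\folder\\\x{ED}.txt";
if (my $fh = open_unicode_path($path)) {
    print "Opened the file\n";
    close $fh;
}
else {
    warn "Oops, can't open file\n";
}
```

The string passed in would be the decoded path from steps 1 and 2 rather than a literal.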