in reply to Re^2: Interventionist Unicode Behaviors
in thread Interventionist Unicode Behaviors
It's trying and failing to convert Unicode code point 0x263a to Latin-1.
No, it is not.
You asked for the code points E2, 98 and BA, and you got them. You then manually messed around with the UTF8 flag. Because of your environment, Perl happened to store the three-character string as Latin-1, so the bytes were E2 98 BA, and you got lucky. Then you turned the UTF8 flag on, and at last you had code point 263A, but you didn't get it the way you should have. When you print this string, however, there's no conversion going on AT ALL, because you never specified what to convert TO!
Perl has no choice but to dump its internal representation to STDOUT, but is friendly enough to warn you that this output may not be what you want, because it doesn't know what you want.
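To make the difference concrete, here is a minimal sketch, assuming the string really starts out as the three bytes E2 98 BA. It puts the flag flipping (via the internal Encode::_utf8_on, shown only for illustration) next to an explicit decode, and then tells Perl what to convert TO by attaching an encoding layer to STDOUT. The variable names are made up for this example.

```perl
use strict;
use warnings;
use Encode ();

# Three characters E2, 98, BA: the UTF-8 encoding of U+263A.
my $bytes = "\xE2\x98\xBA";
my $before = length $bytes;              # three characters at this point

# The hack: flip the internal flag by hand. No conversion happens;
# Perl merely starts reading the same buffer as UTF-8.
my $hacked = $bytes;
Encode::_utf8_on($hacked);

# The intended way: say what the bytes are and let Encode decode them.
my $text = Encode::decode('UTF-8', $bytes);

my $after     = length $text;            # one character now
my $codepoint = sprintf '%04X', ord $text;

# Say what to convert TO when printing: attach an encoding layer.
binmode STDOUT, ':encoding(UTF-8)';
print "U+$codepoint (length $before -> $after): $text\n";
```

Both routes end in the same one-character string here, but only decode states the source encoding explicitly, and only the layer on STDOUT states the target encoding.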
We see the warning because it's impossible to translate a code point that high to Latin 1.
No, we see the warning because you're printing a string that has the UTF8 flag set and holds a character above 0xFF (and thus with certainty is a text string) to a filehandle that doesn't have an encoding layer attached to it.
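A small sketch of that claim, assuming in-memory filehandles as stand-ins for STDOUT (my choice for the example, so that the output and the warning can be inspected): the same string is printed once to a handle without an encoding layer and once to a handle with one.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";    # a text string: one character above 0xFF

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Handle without an encoding layer: Perl does not know what to convert to.
my $raw = '';
open my $plain, '>', \$raw or die "open: $!";
print {$plain} $smiley;
close $plain;
my $warned_plain = scalar @warnings;

# Handle with an encoding layer: the target encoding is stated explicitly.
my $encoded = '';
open my $layered, '>:encoding(UTF-8)', \$encoded or die "open: $!";
print {$layered} $smiley;
close $layered;
my $warned_layered = @warnings - $warned_plain;

print "without layer: $warned_plain warning(s); with layer: $warned_layered\n";
```

The bytes that end up in both buffers are the same E2 98 BA; what differs is whether Perl was told to produce them or just dumped what it had and warned about it.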
I don't want to spend all my time explaining the bottomless intricacies of Unicode handling in Perl to people.
Neither do we, but apparently you INSIST on using the internals directly instead of using things the way they were intended, so we have to explain these bottomless intricacies of Unicode handling in Perl's internals to you if you're ever to understand what the heck your broken code really does.