in reply to Unicode problem with some letters
Perl can store Unicode strings internally in Latin-1 if no character in the string has a codepoint above 255.
That's what happens here, and it's why you don't get the "wide character" warning -- none of your characters is "wider" than 255.
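To see that it is the codepoint and not the internal storage that triggers the warning, here is a small sketch. The in-memory filehandle, the variable names and the smiley character are my own choices for illustration, not something from the original question:

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# An in-memory filehandle with no encoding layer, standing in for a plain STDOUT.
open my $fh, '>', \my $buffer or die "cannot open in-memory handle: $!";

my $narrow = "\x{E0}";     # LATIN SMALL LETTER A WITH GRAVE, codepoint 224
my $wide   = "\x{263A}";   # WHITE SMILING FACE, codepoint 9786

# Force UTF-8 internal storage; the codepoint is still below 256.
utf8::upgrade($narrow);

print {$fh} $narrow;       # no warning: the character fits into one byte
my $after_narrow = @warnings;

print {$fh} $wide;         # warns: "Wide character in print"
my $after_wide = @warnings;

close $fh;
```

After this runs, `$after_narrow` should still be 0 and `$after_wide` should be 1, even though both strings were stored as UTF-8 internally at the time of printing.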
Note that you can still treat $str (or $_) as a character string, and print it if you set up an :encoding(UTF-8) IO layer on STDOUT:
$ echo -e "\xC3\xA0" | perl -CS -pne 'BEGIN{binmode STDIN, ":utf8"}; $_ = uc'
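Roughly the same thing as a script, with the decoding and encoding steps spelled out. This is only a sketch: I use the Encode module here instead of the -CS switch, and the variable names are made up for the example:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The two bytes that echo sends: the UTF-8 encoding of U+00E0.
my $bytes = "\xC3\xA0";

# Decode the bytes into a character string: one character, codepoint 224.
my $chars = decode('UTF-8', $bytes);

# uc now works on characters and gives U+00C0.
my $upper = uc $chars;

# What an :encoding(UTF-8) layer on the output handle produces for that character.
my $encoded = encode('UTF-8', $upper);

# With the layer in place the character string can be printed directly.
binmode STDOUT, ':encoding(UTF-8)';
print $upper, "\n";
```

Here `$chars` has length 1, `$upper` is the single character U+00C0, and `$encoded` is the two bytes `\xC3\x80`.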
Update: on my perl (5.14.1) it seems that $_ is always stored as UTF-8 internally, but the point still applies: no codepoint in that string is above 255, so none is "wide".
Re^2: Unicode problem with some letters
by OlegG (Monk) on Aug 21, 2011 at 18:25 UTC
by moritz (Cardinal) on Aug 21, 2011 at 19:54 UTC
by OlegG (Monk) on Aug 22, 2011 at 15:03 UTC