in reply to Re^2: Unicode problem with some letters
in thread Unicode problem with some letters
When you don't specify :utf8 or :encoding(UTF-8) on a filehandle, Perl assumes Latin-1 (aka ISO-8859-1). In this example only STDIN gets a UTF-8 layer, so STDOUT still writes Latin-1:
$ echo -e "\xC3\xA0" | perl -pne 'BEGIN{binmode STDIN, ":utf8"}' | hexdump -C
e0
Latin-1 0xE0 encodes the code point U+00E0 LATIN SMALL LETTER A WITH GRAVE, which is the same character that the UTF-8 byte sequence C3 A0 encodes.
Since your terminal is configured to receive UTF-8 output (I suppose), it doesn't know what to do with perl's non-UTF-8 output (a lone E0 byte is not valid UTF-8), and shows the generic "I'm confused" replacement character (U+FFFD).
Re^4: Unicode problem with some letters
by OlegG (Monk) on Aug 22, 2011 at 15:03 UTC