in reply to Re^3: Windows-1252 characters from \x{0080} thru \x{009f} (source-code encoding)
in thread Windows-1252 characters from \x{0080} thru \x{009f}
From perlunicode…
"use encoding" needed to upgrade non-Latin-1 byte strings
By default, there is a fundamental asymmetry in Perl's Unicode model: implicit upgrading from byte strings to Unicode strings assumes that they were encoded in ISO 8859-1 (Latin-1), but Unicode strings are downgraded with UTF-8 encoding. This happens because the first 256 codepoints in Unicode happen to agree with Latin-1.
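To make the upgrade half of that concrete, here is a minimal sketch (my own illustration, not taken from perlunicode). It assumes the byte 0x93 was meant as a Windows-1252 left double quote, and the variable names are just for the example. Concatenating with a character above U+00FF forces the implicit upgrade, which reads the byte as Latin-1; an explicit decode with the Encode module is what gives the Windows-1252 meaning.

```perl
use strict;
use warnings;
use Encode qw(decode);

# A one-byte string: 0x93 is a left double quote in Windows-1252,
# but U+0093 is a C1 control character in Latin-1/Unicode.
my $bytes = "\x93";

# Concatenating with a character above U+00FF forces Perl to upgrade
# $bytes implicitly, and the upgrade treats the byte as Latin-1.
my $upgraded = $bytes . "\x{263A}";
printf "implicit upgrade: U+%04X\n", ord($upgraded);   # U+0093

# Decoding explicitly as Windows-1252 yields the intended character.
my $decoded = decode('cp1252', $bytes);
printf "explicit decode:  U+%04X\n", ord($decoded);    # U+201C
```

So if the bytes really are Windows-1252, they need an explicit decode before they get mixed with character strings; the implicit upgrade will not guess the encoding.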
Re^5: Windows-1252 characters from \x{0080} thru \x{009f} (source-code encoding)
by ikegami (Patriarch) on Apr 24, 2012 at 02:16 UTC