PerlMonks
Re: The Queensrÿche Situation by ikegami (Patriarch)
on Oct 19, 2014 at 22:49 UTC ( [id://1104351]=note )
Your terminal expects UTF-8. You printed chr(0xFF), which is not the UTF-8 encoding of "ÿ". You can encode it yourself, or you can ask Perl to do it using the following:
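A minimal sketch of both options, assuming the core Encode module for the manual route and an :encoding(UTF-8) layer for the automatic one. The variable names are illustrative, and an in-memory handle stands in for the terminal so the bytes can be inspected. For a real terminal, `binmode(STDOUT, ':encoding(UTF-8)')` or `use open ':std', ':encoding(UTF-8)';` are the usual ways to apply the same layer; whether either is the exact snippet the post showed is an assumption.

```perl
use strict;
use warnings;
use Encode qw( encode );

# A string holding a single Unicode Code Point, U+00FF.
my $text = chr(0xFF);

# Option 1: encode by hand before printing.
my $bytes = encode('UTF-8', $text);
my $hex   = uc unpack('H*', $bytes);     # hex dump of the encoded bytes: C3BF

# Option 2: let an :encoding layer on the handle do the encoding.
# The in-memory handle is a stand-in for STDOUT (an assumption made
# so that the written bytes can be examined afterwards).
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open failed: $!";
print {$out} $text;
close($out);
```

Both routes end with the same two bytes. The difference is only where the encoding happens: in your code, or in the I/O layer.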
It's not UTF-8 (which would be C3 BF). is_utf8($string) does not indicate whether $string contains UTF-8. It's not UTF-16 (which would be 00 FF or FF 00 depending on endianness). Decoding a string (as use utf8; does for literals) results in Unicode Code Points ("ÿ" is U+00FF).
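To make the two points concrete, here is a small sketch, assuming the core Encode module and the built-in utf8:: functions (variable names are illustrative). It shows a string that holds valid UTF-8 while is_utf8 is false, a string for which is_utf8 is true although its content is unchanged, and what decoding produces.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Valid UTF-8 bytes for U+00FF, yet the internal flag is off.
my $bytes       = "\xC3\xBF";
my $flag_before = utf8::is_utf8($bytes) ? 1 : 0;

# Same single character as chr(0xFF), but stored upgraded internally.
my $upgraded = chr(0xFF);
utf8::upgrade($upgraded);
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;
my $same_string   = $upgraded eq chr(0xFF) ? 1 : 0;

# Decoding turns the encoded bytes into a string of code points.
my $text       = decode('UTF-8', $bytes);
my $length     = length($text);
my $code_point = sprintf('U+%04X', ord($text));   # U+00FF
```

The flag describes how Perl stores the string internally, not what the string means, which is why it cannot be used to tell encoded text from decoded text.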
That is the UTF-8 encoding of "Queensrÿche", though it is incorrect to say that is_utf8 signifies that Encode agrees.
Tools that work with text (such as regular expressions and Text::Unaccent::PurePerl) usually expect the text to be provided as strings of Unicode Code Points, not as bytes encoded using UTF-8.
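A short sketch of why this matters for regular expressions, assuming the core Encode module; the pattern and variable names are illustrative. The same pattern is applied to the encoded bytes and to the decoded text.

```perl
use strict;
use warnings;
use Encode qw( decode );

# UTF-8 encoded bytes, as they would arrive from a file or terminal.
my $bytes = "Queensr\xC3\xBFche";

# The same text as a string of Unicode Code Points.
my $text = decode('UTF-8', $bytes);

# "." matches exactly one character. On the bytes, U+00FF occupies
# two positions (C3 BF), so the pattern fails; on the decoded text
# it occupies one position, so the pattern matches.
my $bytes_match = $bytes =~ /^Queensr.che\z/ ? 1 : 0;
my $text_match  = $text  =~ /^Queensr.che\z/ ? 1 : 0;

my $bytes_length = length($bytes);   # 12
my $text_length  = length($text);    # 11
```

Decoding input at the boundary and encoding output on the way out keeps every text-handling step in between working on characters rather than bytes.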
The aforementioned will also tell Perl to decode bytes read from file handles.
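As a sketch of the reading side, assuming "the aforementioned" puts an :encoding(UTF-8) layer on file handles: the layer is applied here explicitly in the open call, and an in-memory handle with made-up input stands in for a real file.

```perl
use strict;
use warnings;

# Hypothetical input: one line of UTF-8 encoded bytes.
my $raw = "Queensr\xC3\xBFche\n";

# The layer decodes the bytes as they are read.
open(my $in, '<:encoding(UTF-8)', \$raw) or die "open failed: $!";
my $line = <$in>;
close($in);
chomp($line);

# The line now holds code points, so length counts characters.
my $chars   = length($line);
my $eighth  = sprintf('U+%04X', ord(substr($line, 7, 1)));   # U+00FF
```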