That's not strange. You're seeing Unicode codepoints, which for the characters in question happen to be identical to their ISO-8859-1 encodings. Add "\N{EURO SIGN}" to the string and you get "\x{20ac}": again, that is the codepoint and not a UTF-8 encoding.
"Everything is UTF-8" is one of the most frequent false assumptions I encounter when dealing with non-ASCII characters.
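To make the difference between a codepoint and its encoding concrete, here is a minimal sketch. The test string (an e-acute plus the euro sign) is my own choice for illustration, not the string from the thread, and it assumes the core Encode module is available:

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on Perls older than 5.16
use Encode qw(encode);

# Illustrative character string: one character from the Latin-1 range
# and one character outside of it.
my $str = "\N{LATIN SMALL LETTER E WITH ACUTE}\N{EURO SIGN}";

# Codepoints of the characters in the (decoded) string.
my @codepoints = map { sprintf("U+%04X", ord($_)) } split //, $str;

# The bytes you get when the same string is encoded as UTF-8.
my $utf8       = encode('UTF-8', $str);
my @utf8_bytes = map { sprintf("%02X", ord($_)) } split //, $utf8;

# The first character alone, encoded as ISO-8859-1.
my $latin1       = encode('ISO-8859-1', substr($str, 0, 1));
my @latin1_bytes = map { sprintf("%02X", ord($_)) } split //, $latin1;

print "codepoints:  @codepoints\n";    # U+00E9 U+20AC
print "UTF-8 bytes: @utf8_bytes\n";    # C3 A9 E2 82 AC
print "Latin-1:     @latin1_bytes\n";  # E9
```

For the e-acute, the codepoint (E9) matches the single ISO-8859-1 byte, which is why the output can look like Latin-1. The euro sign has no such coincidence: its codepoint is 20AC, while its UTF-8 encoding is the three bytes E2 82 AC.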
In reply to Re^6: UTF8 versus \w in pattern matching (basic test)
by haj
in thread UTF8 versus \w in pattern matching
by mldvx4