It appears $term is not actually UTF-8 encoded when this occurs.
No, it IS UTF-8 encoded; Perl just doesn't know that it is, and that can cause all kinds of problems. If you're reading $term from a handle (or reading any string from an encoded handle), set the handle's encoding with binmode before reading from it, e.g. binmode HANDLE, ":utf8"; Alternatively, you can specify the :utf8 layer when you open() the file.
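As a rough sketch of the difference a layer makes (not your actual code): the example below reads the same UTF-8 bytes twice from an in-memory handle, once without a layer and once with one. The variable names are made up for illustration, and I've used :encoding(UTF-8) here, which does the same decoding as :utf8 but also validates the input.

```perl
use strict;
use warnings;

# The UTF-8 encoding of "baño": the ñ is the two bytes 0xC3 0xB1.
# $bytes stands in for whatever is actually on disk (hypothetical data).
my $bytes = "ba\xc3\xb1o";

# No layer: Perl hands back the raw bytes, one "character" per byte.
open my $raw, '<', \$bytes or die "open: $!";
my $undecoded = <$raw>;
close $raw;

# With a layer: Perl decodes the bytes into characters while reading.
open my $fh, '<:encoding(UTF-8)', \$bytes or die "open: $!";
my $decoded = <$fh>;
close $fh;

printf "without layer: %d characters\n", length $undecoded;
printf "with layer:    %d characters\n", length $decoded;
```

For a handle that is already open, binmode $fh, ':encoding(UTF-8)'; before the first read should have the same effect.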
About the [UTF8 "ba\x{f1}o"]: note that \x{f1} does NOT specify an encoding. It's the literal notation for the character with code point 241 (U+00F1) in Unicode, which is also character 241 in Latin-1, i.e. "ñ" eq "\x{f1}". The advantage is that the notation itself is 7-bit ASCII, so it prints correctly (almost) everywhere, whether your output expects UTF-8, Latin-1, ISO-8859-15, etc.
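To illustrate that "\x{f1}" names a character rather than a byte sequence, here is a small sketch using the core Encode module. The variable names are only for illustration; the point is that one string produces different bytes depending on the encoding you pick at output time.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Four characters; the third one is code point 241 (U+00F1).
my $term = "ba\x{f1}o";

# The encoding only enters the picture when characters become bytes.
my $as_utf8   = encode('UTF-8',      $term);   # ñ becomes two bytes
my $as_latin1 = encode('ISO-8859-1', $term);   # ñ becomes one byte

# Helper (illustrative) to show a byte string as hex.
sub hex_bytes { join ' ', map { sprintf '%02x', ord($_) } split //, $_[0] }

printf "code point: %d\n", ord(substr($term, 2, 1));
printf "UTF-8:      %s\n", hex_bytes($as_utf8);
printf "Latin-1:    %s\n", hex_bytes($as_latin1);
```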
In reply to Re^7: Malformed UTF-8
by Joost
in thread Malformed UTF-8
by spiros