in reply to substr on utf8-strings

From your examples, this isn't a problem with substr but with making sure the utf8 flag is set on the data. I suspect some XS code (either the getline or the ...) is not setting the flag correctly. (What class is that getline in? What class is $io?) You can put Encode::is_utf8($scalar) checks through your code to figure out where it's being lost, and if necessary, call Encode::_utf8_on($scalar).

Update: the suggestion of using _utf8_on is a temporary workaround and is not a substitute for reporting a bug to the author if a module is not correctly handling UTF-8 input.
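
To make the checks concrete, here is a minimal sketch of what I mean. The flag_report helper and the hard-coded $line are made up for illustration; $line just stands in for whatever your getline returns, assuming it hands back valid UTF-8 octets with the flag off.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not from any module): describe a scalar's state.
sub flag_report {
    my ($label, $scalar) = @_;
    return sprintf '%s utf8 flag %s, length %d, substr(2,1) = 0x%02X',
        $label,
        (Encode::is_utf8($scalar) ? 'on' : 'off'),
        length($scalar),
        ord(substr($scalar, 2, 1));
}

# Stand-in for a line returned by a reader that forgot to set the flag:
# the UTF-8 octets for "naïve" with the flag off.
my $line = "na\xC3\xAFve";

my $before = flag_report('after read:', $line);
print "$before\n";    # flag off, length 6, substr gives 0xC3 (half a character)

# Temporary workaround: mark the existing octets as UTF-8 in place.
# No validation is done, so only use this on data known to be valid UTF-8.
Encode::_utf8_on($line);

my $after = flag_report('after _utf8_on:', $line);
print "$after\n";     # flag on, length 5, substr gives 0xEF (the whole "ï")
```

Drop a call like flag_report after each step that touches the data (read, split, any module call) and the first place that reports "off" is where the flag is being lost.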