in reply to Re^4: substr on UTF-8 strings
in thread substr on UTF-8 strings

First of all, who knows where my script, or parts of it, will land. Maybe on Windows.

See here for a portable version.

That said, if you're dealing with Unicode on Windows, you'd want to switch your console to code page 65001 (chcp 65001) and use UTF-8 anyway.
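As a rough sketch of what that looks like on the Perl side, assuming the console has already been switched with chcp 65001 (the sample string is made up for illustration):

```perl
use strict;
use warnings;

# Assumption: the console code page was already set with "chcp 65001".
# Encode everything printed to STDOUT as UTF-8.
binmode STDOUT, ':encoding(UTF-8)' or die "binmode failed: $!";

# Hypothetical sample: a decoded (character) string, not UTF-8 bytes.
my $str = "\x{263A} caf\x{e9}";

# substr counts characters on decoded strings, so this extracts "café".
my $part = substr($str, 2, 4);
print $part, "\n";
```

The point is that substr only sees characters because the string is decoded; the encoding to UTF-8 happens at the output layer.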

What if your file mixes binary and UTF-8?

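Binary files should be opened using :raw. An explicit layer in the open call overrides the defaults set by use open. Any portion that has to be written as UTF-8 from decoded text can be encoded with Encode's encode or the builtin utf8::encode (which works in place). Going the other way, the UTF-8 portions you read back are still bytes and need Encode's decode or utf8::decode before you treat them as text.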
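A minimal sketch of the mixed case, assuming a made-up record layout (a 4-byte big-endian length followed by that many bytes of UTF-8 text) and a hypothetical file name:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumptions for illustration only: file name and record layout.
my $file = 'mixed.bin';
my $text = "na\x{ef}ve \x{2603}";    # decoded text, 7 characters

# Write: :raw keeps the handle byte-oriented, so encode the text by hand.
open my $out, '>:raw', $file or die "Cannot write $file: $!";
my $bytes = encode('UTF-8', $text);
print {$out} pack('N', length $bytes), $bytes;   # binary header + UTF-8 payload
close $out or die "Cannot close $file: $!";

# Read: take the binary header as bytes, then decode only the text portion.
open my $in, '<:raw', $file or die "Cannot read $file: $!";
read($in, my $header, 4) == 4 or die "Short header";
my $len = unpack 'N', $header;
read($in, my $payload, $len) == $len or die "Short payload";
close $in;

my $decoded = decode('UTF-8', $payload);   # back to a character string
unlink $file;
```

The header stores the byte length of the encoded text, not the character count, which is why length is taken after encode.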