in reply to UTF8 issues
decode works, but it takes a single scalar, not a list, so you have to apply it to each line:
    my @encoded_hira   = <INP>;
    my @decoded_hira   = map { decode('UTF-8', $_) } @encoded_hira;
    my @decoded_romaji = kana2romaji(@decoded_hira);
    my @encoded_romaji = map { encode('UTF-8', $_) } @decoded_romaji;
    print OUTP @encoded_romaji;
binmode works too, but it takes a file handle, not a string:
    binmode INP,  ':encoding(UTF-8)';
    binmode OUTP, ':encoding(UTF-8)';
But the simplest way is to pass that directive to open.
    open(INP,  '<:encoding(UTF-8)', $ARGV[0]) or die "Can't open $ARGV[0]: $!";
    open(OUTP, '>:encoding(UTF-8)', $ARGV[1]) or die "Can't open $ARGV[1]: $!";
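To see the open layers at work without your data or the romaji module, here is a minimal, self-contained sketch. The temp files and the two-character test string are my own stand-ins, not anything from your script; it only shows that text read through the layer is counted in characters and written back out as UTF-8 bytes.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in input file: two hiragana (U+304B U+306A) as raw UTF-8 bytes.
my ($tmp_fh, $in_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;                                  # write the bytes untouched
print {$tmp_fh} "\xE3\x81\x8B\xE3\x81\xAA\n";
close $tmp_fh or die "Can't close $in_path: $!";

# Reading through the layer decodes on the way in.
open(my $in, '<:encoding(UTF-8)', $in_path) or die "Can't open $in_path: $!";
my @lines = <$in>;
close $in;
chomp @lines;

my $chars = length $lines[0];                     # characters, not bytes

# Writing through the layer encodes on the way out.
my ($out_tmp, $out_path) = tempfile(UNLINK => 1);
close $out_tmp;
open(my $out, '>:encoding(UTF-8)', $out_path) or die "Can't open $out_path: $!";
print {$out} "$_\n" for @lines;
close $out or die "Can't close $out_path: $!";

my $byte_count = -s $out_path;                    # size on disk, in bytes
```

With the layers in place there is no decode or encode call anywhere in the program; anything between the read and the write (kana2romaji in your case) only ever sees decoded text.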
use utf8; tells Perl that the source code itself is encoded in UTF-8. It has no effect on file I/O, so it is not relevant here.
_utf8_on is an unsafe version of decode. Like decode, it would have worked if you had used it properly.
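A small sketch of what "unsafe" means here, using made-up byte strings rather than your data: on valid input both approaches give the same text, but only decode (with a check argument) notices invalid input. _utf8_on performs no validation at all, which is why I don't call it on the bad bytes below.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $good = "\xE3\x81\x8B";    # valid UTF-8 for U+304B
my $bad  = "\xFF\xFE";        # not valid UTF-8

# decode validates; FB_CROAK makes it die on malformed input,
# LEAVE_SRC keeps it from modifying the source string.
my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();
my $text  = decode('UTF-8', $good, $check);

# _utf8_on just flips the internal flag on the same bytes.
my $flagged = $good;
Encode::_utf8_on($flagged);

# On invalid bytes, decode refuses instead of producing a corrupt string.
my $bad_accepted = eval { decode('UTF-8', $bad, $check); 1 } ? 1 : 0;
```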
(Note that the module is wrong to check for the UTF8 flag. Informally, this is called "The Unicode Bug". It's trying to detect if you made an error, but it can incorrectly flag valid inputs as errors.)
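To illustrate why the flag is a poor test (this is a sketch with my own example string, not the module's actual check): the same text can be stored with or without the UTF8 flag, so the flag says nothing about whether a string has been decoded.

```perl
use strict;
use warnings;

my $native   = "caf\xE9";     # four characters, stored one byte each
my $upgraded = $native;
utf8::upgrade($upgraded);     # same characters, stored internally as UTF-8

my $same_text     = $native eq $upgraded      ? 1 : 0;
my $flag_native   = utf8::is_utf8($native)    ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded)  ? 1 : 0;
```

The two strings compare equal, yet only one carries the flag. A check based on utf8::is_utf8 would accept one and reject the other even though they are the same text.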