I don't yet know if the specific data that has to be matched loses info if I convert to Roman/Latin
You can tell Perl the file is encoded in UTF-16, so it will decode it properly. This way you won't lose anything. E.g.
my $infile = shift @ARGV;
open my $fh, "<:encoding(UTF-16)", $infile or die $!;
while (<$fh>) {
    ...
}
(If the file has no BOM, encoding(UTF-16) cannot determine the byte order, so you might need to use encoding(UTF-16LE), or encoding(UTF-16BE) for big-endian data, instead of encoding(UTF-16).)
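If you don't know in advance whether the file starts with a BOM, you could peek at its first two bytes and choose the layer from that. This is only a rough sketch: the helper name utf16_layer_for is made up for illustration, and falling back to little-endian for BOM-less files is an assumption you would need to check against your actual data.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns a PerlIO layer
# string for a UTF-16 file, based on whether it starts with a BOM.
sub utf16_layer_for {
    my ($path) = @_;

    # Read the first two bytes without any decoding.
    open my $raw, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read $raw, my $head, 2;
    close $raw;
    die "Cannot read $path: $!" unless defined $got;

    # FF FE (little-endian) or FE FF (big-endian) is a BOM, so the plain
    # UTF-16 layer can work out the byte order by itself.
    return ':encoding(UTF-16)'
        if $head eq "\xFF\xFE" || $head eq "\xFE\xFF";

    # ASSUMPTION: files without a BOM are little-endian.
    return ':encoding(UTF-16LE)';
}
```

You would then open the file with something like open my $fh, '<' . utf16_layer_for($infile), $infile or die $!; and read from $fh as usual.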