
I don't yet know if the specific data that has to be matched loses info if I convert to Roman/Latin

You can tell Perl the file is encoded in UTF-16, so it will decode it properly.  This way you won't lose anything.  E.g.

my $infile = shift @ARGV;
open my $fh, "<:encoding(UTF-16)", $infile or die $!;
while (<$fh>) {
    ...
}

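(Perl's UTF-16 decoder relies on the BOM to determine the byte order. If the file has no BOM, you need to name the byte order explicitly: use encoding(UTF-16LE) or encoding(UTF-16BE) instead of encoding(UTF-16).)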
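
If you don't know in advance whether the file carries a BOM, you could peek at the first two bytes and pick the layer from that. This is only a rough sketch: the helper names utf16_layer_for and read_utf16_lines are made up for illustration, and falling back to UTF-16LE for BOM-less files is an assumption (typical for files produced on Windows), so adjust it if your data is big-endian.

```perl
use strict;
use warnings;

# Hypothetical helper: pick a decoding layer by peeking at the first two bytes.
sub utf16_layer_for {
    my ($path) = @_;
    open my $raw, '<:raw', $path or die "Cannot open $path: $!";
    my $got = read($raw, my $head, 2);
    close $raw;
    die "Cannot read $path: $!" unless defined $got;

    # FF FE is the little-endian BOM, FE FF the big-endian one.
    # With a BOM present, plain UTF-16 works and the decoder strips the BOM.
    return 'UTF-16'
        if $got == 2 && ($head eq "\xFF\xFE" || $head eq "\xFE\xFF");

    # Assumption: a file without a BOM is little-endian.
    return 'UTF-16LE';
}

# Hypothetical helper: return the decoded lines of the file, newlines removed.
sub read_utf16_lines {
    my ($path) = @_;
    my $layer = utf16_layer_for($path);
    open my $fh, "<:encoding($layer)", $path or die "Cannot open $path: $!";
    my @lines = <$fh>;
    close $fh;
    chomp @lines;
    return @lines;
}
```

You would then loop over read_utf16_lines($infile) instead of reading from $fh directly. For very large files, keep the while (<$fh>) loop and use only utf16_layer_for to choose the layer.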