The bytes you are skipping form character U+FEFF, the Byte-order mark. Use UCS-2 instead of UCS-2le and it will skip the character for you.
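To see that those two bytes really are a character, here is a minimal sketch. The sample bytes and the strip_bom helper are made up for illustration; only Encode's decode is real API.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: drop a leading U+FEFF from already-decoded text.
sub strip_bom {
    my ($text) = @_;
    $text =~ s/\A\x{FEFF}//;
    return $text;
}

# Made-up sample: BOM bytes FF FE, then "Hi" as little-endian 16-bit units.
my $bytes = "\xFF\xFE" . "H\x00i\x00";

# UCS-2LE does no BOM processing, so the BOM decodes to an ordinary character.
my $decoded = decode('UCS-2LE', $bytes);
printf "first character: U+%04X\n", ord($decoded);

my $clean = strip_bom($decoded);
print "length after stripping: ", length($clean), "\n";
```

This should print U+FEFF for the first character and a length of 2 once the BOM is stripped.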
The wide character warning is issued because you are outputting decoded characters without encoding them (loosely speaking). Fix:
    # Encode output.
    # Use the encoding that's appropriate for you.
    binmode STDOUT, ':encoding(UTF-8)';

    my $lines;
    {
        # Decode input.
        open my $log_fh, "<:encoding(UCS-2)", $file
            or die($!);
        local $/ = undef;    # Slurp mode: read the whole file at once.
        $lines = <$log_fh>;  # Read from the lexical handle opened above.
    }

    print "...\n", $lines, "...\n";
On Unix, you can write use open ':std', ':locale'; to set the "correct" encoding for STDOUT, but that doesn't work on Windows :(
If I did not skip the first two bytes, [...] "$lines = <LOGFILE>" would only capture a few characters out of a 1088 character file.
You are mistaken.
If Perl should have handled the UCS-2LE file without needing to include the encoding or the skipping of bytes
Perl has no way of knowing the encoding of a file, or even if it's a text file for that matter.
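So the program has to decide. If you cannot hard-code the encoding, one common approach is to peek at the first bytes for a BOM. This is only a sketch: encoding_from_bom is a hypothetical helper, not something Perl or Encode provides, and a file without a BOM still leaves you guessing.

```perl
use strict;
use warnings;

# Hypothetical helper: map a leading BOM to an encoding name.
# Returns undef when there is no BOM; the caller must then already
# know the encoding or fall back to a default of its own choosing.
sub encoding_from_bom {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;   # a UTF-32LE BOM also starts this way
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return;
}

print encoding_from_bom("\xFF\xFEH\x00") // 'unknown', "\n";
print encoding_from_bom("plain ASCII")   // 'unknown', "\n";
```

The first call should report UTF-16LE and the second should report unknown.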
If the IDrive log files might be a non-standard or corrupted UCS-2LE
Why do you ask that?
There are some byte combinations that aren't allowed in UCS-2*. Encountering them is fatal.
    $ perl -e'open $fh, "<:encoding(UCS-2le)", \"\x00\xD8"; <$fh>'
    UCS-2LE:no surrogates allowed d800 at -e line 1.
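If you would rather report such a file than die on it, you can trap the error. The sketch below is an assumption-laden variation: it calls Encode's decode with FB_CROAK directly instead of going through the :encoding layer, and try_decode_ucs2le is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode UCS-2LE bytes, trapping fatal decode errors.
# Returns (text, undef) on success and (undef, error message) on failure.
sub try_decode_ucs2le {
    my ($bytes) = @_;    # a private copy; decode may modify it when a check is set
    my $text = eval { decode('UCS-2LE', $bytes, Encode::FB_CROAK()) };
    return (undef, $@) unless defined $text;
    return ($text, undef);
}

# "\x00\xD8" is the lone surrogate D800 from the one-liner above.
my ($text, $err) = try_decode_ucs2le("\x00\xD8");
print defined $err ? "decode failed: $err" : "decoded OK\n";
```

Whether to skip the file, log it, or retry with a more forgiving encoding such as UTF-16LE is up to you.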
In reply to Re: Problems Handling UCS-2LE
by ikegami
in thread Problems Handling UCS-2LE
by Ecurb