note
farang
<p>If you have a file with a known encoding, for instance koi8-r,
just use
<c>
open( my $fh, '< :encoding(koi8-r)', "in_file") or die "cannot open in_file: $!";
</c>
and it should work fine under utf8::all.
<blockquote>
And i want the scripts to: a) warn me when i open a file that contains non-utf characters rather than die at it;
</blockquote>
I don't think this can be done at the moment the file is opened,
because the error only arises when a malformed UTF-8 sequence is
actually read and decoded. One way to do it is to wrap the line-by-line
read in <tt>eval</tt> and trap the error. Here is some code that does
that: it first tries reading as UTF-8 and, if the data turns out to be
invalid, warns and retries with koi8-r.
<c>
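use strict;
use warnings;
use utf8::all;    # default UTF-8 layers; malformed UTF-8 becomes a fatal error

open(my $fh, '<', "in_file") or die "cannot open in_file: $!";

# First attempt: read as UTF-8 and trap any decoding error.
eval { process_file_by_line() };

if ( $@ =~ /does not map to Unicode/ ) {
    warn $@;
    print "...trying encoding koi8-r instead of utf8\n\n";
    close $fh;
    # Reopen from the start with an explicit layer. Any lines already
    # handled before the error will be processed a second time.
    open( $fh, '< :encoding(koi8-r)', "in_file") or die "cannot open in_file: $!";
    process_file_by_line();
}
elsif ( $@ ne '' ) {
    die $@; # bail out on other eval errors
}

# Reads from the file-scoped $fh opened above.
sub process_file_by_line {
    while ( <$fh> ) {
        print;
        # whatever else...
    }
}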
</c>
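A variation on the same idea, offered only as a rough sketch: read the raw bytes first and check whether they decode as strict UTF-8 before choosing a layer, so that no line is processed twice. The helper name <tt>open_with_fallback</tt> is made up for illustration; <tt>Encode::decode</tt> and its <tt>FB_CROAK</tt> check value are from the core Encode module. It assumes the file is small enough to slurp into memory.
<c>
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns a read handle on $path, decoded as UTF-8
# when the whole file is valid UTF-8 and as $fallback otherwise.
sub open_with_fallback {
    my ($path, $fallback) = @_;

    # Slurp the raw bytes, with no decoding layer.
    open(my $raw, '<:raw', $path) or die "cannot open $path: $!";
    my $bytes = do { local $/; <$raw> };
    close $raw;
    $bytes = '' unless defined $bytes;    # empty file

    # FB_CROAK makes decode() die on the first malformed sequence.
    # decode() may modify its input when a check value is given,
    # so it gets a scratch copy.
    my $encoding = 'UTF-8';
    my $copy     = $bytes;
    unless ( eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ) {
        warn "$path is not valid UTF-8, trying $fallback instead\n";
        $encoding = $fallback;
    }

    open(my $fh, "<:encoding($encoding)", $path) or die "cannot open $path: $!";
    return $fh;
}
</c>
Usage would then look something like this (under utf8::all the output layer is already set up; without it you would need to <tt>binmode</tt> STDOUT yourself):
<c>
my $fh = open_with_fallback('in_file', 'koi8-r');
print while <$fh>;
</c>
The trade-off is that the file is read twice and held in memory once, in exchange for deciding the encoding before any line is processed.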
</p>
1051913
1051913