Errto has asked for the wisdom of the Perl Monks concerning the following question:
Folks,
I have a large collection of text files which all should be in UTF-8, but some of them are not. I need to determine which ones those are. I am currently trying something like this:

    eval {
        open my $file, '<:utf8', $filename or die $!;
        local $/;
        <$file>;
    };
    die "$filename is invalid utf8: $@\n" if $@;

If I pass $filename as a file that is not in UTF-8 (i.e. it's in cp1252 which is my other contender) I get a warning on STDERR but the eval does not die as I would like it to. I would like to know, how can I make it do that?
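Since the `:utf8` layer only warns here, one possible approach is to read the file as raw bytes and decode them explicitly with the core Encode module, passing `FB_CROAK` so that malformed input raises an exception. The following is a minimal sketch under that assumption, not a tested answer from this thread; the helper names `is_valid_utf8` and `file_is_valid_utf8` are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # work on a copy; decode() may modify its input when a CHECK value is given
    # With FB_CROAK, decode() dies on the first malformed sequence,
    # so the eval block only reaches the trailing 1 for valid input.
    return eval { decode('UTF-8', $bytes, FB_CROAK); 1 } ? 1 : 0;
}

# Hypothetical helper: slurp a file without any decoding layer and check it.
sub file_is_valid_utf8 {
    my ($filename) = @_;
    open my $fh, '<:raw', $filename or die "Cannot open $filename: $!";
    local $/;            # slurp mode: read the whole file in one go
    my $bytes = <$fh>;
    close $fh;
    $bytes = '' unless defined $bytes;
    return is_valid_utf8($bytes);
}
```

A caller could then write `die "$filename is invalid utf8\n" unless file_is_valid_utf8($filename);`. Note that a cp1252 file containing only ASCII characters is also valid UTF-8, so a check like this only flags files that actually contain non-ASCII bytes.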
Re: detect incorrect character encoding
by shigetsu (Hermit) on Jan 03, 2007 at 00:05 UTC

Re: detect incorrect character encoding
by graff (Chancellor) on Jan 03, 2007 at 01:49 UTC

Re: detect incorrect character encoding
by bsdz (Friar) on Jan 03, 2007 at 00:26 UTC

Re: detect incorrect character encoding
by almut (Canon) on Jan 03, 2007 at 05:24 UTC
by graff (Chancellor) on Jan 03, 2007 at 07:52 UTC
by almut (Canon) on Jan 03, 2007 at 16:05 UTC

Re: detect incorrect character encoding
by cub.uanic (Acolyte) on Jan 04, 2007 at 05:42 UTC