in reply to Perl detect utf8, iso-8859-1 encoding

I am afraid I do not have the luxury of discarding all non-UTF-8 input, but I can simplify the code:

if the input is not detected as UTF-8, just treat it as ISO-8859-1.

    use strict;
    use warnings;
    use Text::Unaccent;
    use Encode::Detect::Detector;

    # Test inputs (URL-encoded):
    # my $author = "Sch%F6%E5ttl";
    # my $author = "Sch%C3%A9ttl";
    # my $author = "Sch%C3%B6ttl";
    # my $author = "Sch%F6%F6ttl";
    # my $author = "Sch%F6 %F4ttl";
    my $author = "teoria elasticit%E0";

    # Turn %XX escapes into raw bytes (hex digits only)
    $author =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;

    # detect() may return undef when it cannot decide
    my $encoding = Encode::Detect::Detector::detect($author);

    # Anything not detected as UTF-8 is treated as ISO-8859-1
    if (!defined $encoding || $encoding !~ m#utf-8#i) {
        $encoding = "iso-8859-1";
    }

    $author = unac_string($encoding, $author);
    print "after unac: $author<br>\n";

It seems to be working better. Are there any potential problems?
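
One possible weak spot is that the detector is statistical, so on short strings it might guess something other than UTF-8 for input that is in fact valid UTF-8. A stricter variant I could try is to attempt a strict UTF-8 decode with the core Encode module and fall back to ISO-8859-1 only when that fails. This is just a minimal sketch under that assumption; `decode_utf8_or_latin1` is a hypothetical helper name of my own, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns (decoded text, encoding name).
# Bytes that form valid UTF-8 are decoded as UTF-8;
# anything else is assumed to be ISO-8859-1.
sub decode_utf8_or_latin1 {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8;
    # LEAVE_SRC keeps $bytes unmodified for the fallback.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return ($text, 'utf-8') if defined $text;

    # Every byte sequence is valid ISO-8859-1, so this cannot fail.
    return (decode('iso-8859-1', $bytes), 'iso-8859-1');
}

my $author = "teoria elasticit%E0";

# Turn %XX escapes into raw bytes
$author =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;

my ($text, $encoding) = decode_utf8_or_latin1($author);
print "treated as: $encoding\n";
```

The returned encoding name could then be passed to unac_string together with the original bytes, in place of the detector result. Pure ASCII input counts as valid UTF-8 here, which should be harmless since both encodings agree on ASCII.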