Malformed UTF-8 character

jeanluca has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I've a small script that replaces characters inside a file, like:

open IN,"$_" or die "Could not read file\n" ;

local $/ ;
my $file = <IN> ; # read file
close IN ;

$file =~ s/$ARGV[1]/$ARGV[2]/g ;

For some files the following exception was generated:

Malformed UTF-8 character (unexpected non-continuation byte 0x20, immediately after start byte 0xe9) in substitution iterator at ./myscript.pl line 23

Any suggestions as to what went wrong here?

Thanks in advance
Luca


Re: Malformed UTF-8 character by Realbot (Scribe) on Mar 14, 2006 at 14:14 UTC
You probably have some strings with high bits set in them. Find out what their encoding is and then apply `use Encode; my $perl_string = decode(<encoding>, $original_string);` where <encoding> can be "iso-8859-1", "utf8", or one of the other available encodings; see the Encode documentation. Then you can apply the s/// substitution. Regards.
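A minimal sketch of that decode, substitute, encode round trip, in case it helps. The helper name replace_in_bytes and the choice of "iso-8859-1" are assumptions made for illustration, not something taken from the original script; the encoding has to match how the file was really written. The sample string contains byte 0xE9 followed by a space, the same byte pair the error message complains about.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the original script): decode raw bytes into
# a Perl character string, run the substitution, then encode back to bytes.
# $encoding is an assumption and must match the file's real encoding.
sub replace_in_bytes {
    my ($bytes, $from, $to, $encoding) = @_;
    my $text = decode($encoding, $bytes);   # bytes -> characters
    $text =~ s/$from/$to/g;                 # $from is a regex, as in the question
    return encode($encoding, $text);        # characters -> bytes
}

# ISO-8859-1 bytes: 0xE9 (e with acute accent) followed by a space.
my $latin1 = "caf\xE9 au lait";

my $plain  = replace_in_bytes($latin1, "\x{E9}", "e",       "iso-8859-1");
my $accent = replace_in_bytes($latin1, "lait",   "th\x{E9}", "iso-8859-1");

print "$plain\n";    # prints "cafe au lait"
```

In a real script the search and replacement strings taken from @ARGV would need the same treatment, decoded from whatever encoding the terminal uses, so that both sides of the substitution are character strings.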
Re^2: Malformed UTF-8 character by jeanluca (Deacon) on Mar 14, 2006 at 14:58 UTC
OK, thanks. Do you know of some URLs where I can read more about this (iso-88..., utf8) so that I get a better understanding of what it means? Luca
Re^3: Malformed UTF-8 character by mirod (Canon) on Mar 14, 2006 at 15:51 UTC
Would "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" be OK? Oh, and of course, in Perl: `perldoc perlunicode`.
Re^3: Malformed UTF-8 character by Realbot (Scribe) on Mar 14, 2006 at 17:32 UTC
Sure, in random order:
http://intertwingly.net/stories/2004/04/14/i18n.html (Survival guide to i18n)
http://jerakeen.org/slush/talk-perl-loves-utf8/ (Perl Loves UTF-8)
http://www.ahinea.com/en/tech/perl-unicode-struggle.html (Unicode-processing issues in Perl and how to cope with it)
http://rf.net/~james/perli18n.html (Perl, Unicode and i18N FAQ; rather old, but still valuable)
Cheers.