jeanluca has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks

I've a small script that replaces characters inside a file, like:
open IN,"$_" or die "Could not read file\n" ;
local $/ ;
my $file = <IN> ; # read file
close IN ;
$file =~ s/$ARGV[1]/$ARGV[2]/g ;
For some files the following exception was generated:
Malformed UTF-8 character (unexpected non-continuation byte 0x20, immediately after start byte 0xe9) in substitution iterator at ./myscript.pl line 23
Any suggestions as to what went wrong here?

Thanks in advance
Luca

Replies are listed 'Best First'.
Re: Malformed UTF-8 character
by Realbot (Scribe) on Mar 14, 2006 at 14:14 UTC
    You probably have some strings with high-bits set in them. Find out what their encoding is and then apply
    use Encode;
    my $perl_string = decode(<encoding>, $original_string);
    where <encoding> can be "iso-8859-1", "utf8" or one of the other available encodings (see the Encode documentation).
    Then you can apply s//
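
As a minimal sketch of the decode, substitute, re-encode sequence described above: the helper name replace_in_bytes and the choice of "iso-8859-1" are assumptions for illustration, not something from the original script. The byte 0xe9 followed by a space (0x20) is consistent with Latin-1 text containing "é " being treated as UTF-8, but the real encoding of the files has to be confirmed first.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode raw bytes into characters, substitute,
# then encode back to bytes. $encoding is an assumption and must match
# how the file was actually written.
sub replace_in_bytes {
    my ($bytes, $encoding, $from, $to) = @_;
    my $text = decode($encoding, $bytes);   # bytes -> Perl character string
    $text =~ s/\Q$from\E/$to/g;             # \Q...\E matches $from literally;
                                            # drop it if $from is a regex
    return encode($encoding, $text);        # characters -> bytes again
}

# "caf\xe9 noir" is "café noir" as ISO-8859-1 bytes: 0xe9 then 0x20,
# the same byte pair named in the error message.
my $out = replace_in_bytes("caf\xe9 noir", 'iso-8859-1', "\x{e9}", 'e');
print "$out\n";   # prints "cafe noir"
```

In the original script the same idea would apply to $file after it is read and before it is written back; an alternative is to open the file with an :encoding(...) layer so that Perl decodes on read.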

    Regards.
      OK, thanks. Do you know of any URLs where I can read more about this (iso-88..., utf8), so I can get a better understanding of what it means?

      Luca