Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Here is my script (sp.pl):

    use Encode;
    use utf8;
    #use open IO => ':locale';

    #my $s = "El supersónico de los Indi ";
    my $s1 = "El supero de los Indi ";
    #$s1 = decode_utf8( $s);

    print "\n\nStart string: $s1\n\n";
    my $s2 = &fix_special_characters($s1);
    print"\nEnd string: $s2\n\n";

    sub fix_special_characters {
        my($string) = @_;
        open(C,"<:utf8","chars.txt");
        my @c = <C>;
        for(my $i=0; $i < @c; $i++) {
            my ($special,$htmlchar) = split(/\t/,$c[$i]);
            print "$special : $htmlchar";
            $string =~ s/$special/$htmlchar/ig; ## this is generating the error message
        }
        return $string;
    }

This is the output when the substitution line (line 30) is commented out:

    Start string: El supero de los Indi

    Á : Á
    á : á
    É : É
    é : é
    Í : Í
    í : í
    Ñ : Ñ
    ñ : ñ
    Ó : Ó
    ó : ó
    Ú : Ú
    ú : ú
    Ü : Ü
    ü : ü
    ¿ : ¿
    ¡ : ¡

    End string: El supero de los Indi

However, when I uncomment that substitution line, I get the following error message for every line in the chars file:

    Malformed UTF-8 character (unexpected non-continuation byte 0x20, immediately after start byte 0xc1) in regexp compilation at sp.pl line 30, <C> line 16.
    Malformed UTF-8 character (unexpected non-continuation byte 0x20, immediately after start byte 0xc1) in regexp compilation at sp.pl line 30, <C> line 16.

I have spent hours trying different methods to make this work, with no luck. Can any monks out there help with this? Thank you.
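For anyone landing here with the same message: the start byte 0xc1 is the ISO-8859-1 code for Á, which suggests (but does not prove) that chars.txt was saved as Latin-1 and is being read through a `:utf8` layer that does not validate its input. The following is only a minimal sketch under that assumption. The temporary file stands in for chars.txt, the entity values are illustrative, and the three-argument form of `fix_special_characters` (string, file name, encoding name) is a hypothetical variant, not the script from the question.

```perl
use strict;
use warnings;
use utf8;                       # string literals in this source are UTF-8
use File::Temp qw(tempfile);

# Stand-in for chars.txt, written as ISO-8859-1 (an assumption about the
# real file). Each line is: special character, TAB, replacement text.
my ($tmp_fh, $map_file) = tempfile(UNLINK => 1);
binmode $tmp_fh, ':encoding(ISO-8859-1)';
print {$tmp_fh} "ó\t&oacute;\n", "ñ\t&ntilde;\n";
close $tmp_fh or die "Cannot close $map_file: $!";

# Hypothetical variant of the sub: the caller names the file's real
# encoding, so every line is decoded into proper characters on read.
sub fix_special_characters {
    my ($string, $file, $encoding) = @_;
    open my $in, "<:encoding($encoding)", $file
        or die "Cannot open $file: $!";
    while (my $line = <$in>) {
        chomp $line;            # drop the newline so it never reaches the output
        my ($special, $htmlchar) = split /\t/, $line, 2;
        next unless defined $htmlchar && length $special;
        # \Q...\E treats the character literally, never as regex syntax.
        $string =~ s/\Q$special\E/$htmlchar/g;
    }
    close $in;
    return $string;
}

my $fixed = fix_special_characters('El supersónico de los Indi',
                                   $map_file, 'ISO-8859-1');
print "$fixed\n";
```

If the file really is UTF-8, passing 'UTF-8' as the encoding gives strict decoding, so bad bytes are reported when the file is read instead of surfacing later in the regex. The sketch also drops the `/i` flag: with both Á and á in the list, a case-insensitive match would let whichever entry comes first replace both cases.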
Replies are listed 'Best First'.

Re: utf-8 problem
by kennethk (Abbot) on Jan 29, 2009 at 22:07 UTC

Re: utf-8 problem
by eff_i_g (Curate) on Jan 29, 2009 at 22:09 UTC
by Anonymous Monk on Jan 29, 2009 at 22:16 UTC
by eff_i_g (Curate) on Jan 29, 2009 at 22:21 UTC
by kennethk (Abbot) on Jan 29, 2009 at 22:29 UTC
by almut (Canon) on Jan 29, 2009 at 22:36 UTC

Re: utf-8 problem
by Marshall (Canon) on Jan 30, 2009 at 15:04 UTC
by ikegami (Patriarch) on Jan 30, 2009 at 15:26 UTC
by Marshall (Canon) on Jan 30, 2009 at 17:49 UTC

Re: utf-8 problem
by bichonfrise74 (Vicar) on Jan 31, 2009 at 17:39 UTC