Re: Encode: unable to change encoding of strings

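Good morning, Hue-Bond,

*hint* from_to decodes and encodes in place: it modifies its first argument and returns the length of the converted string in bytes, not the string itself.

Perhaps you should use decode first to convert: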

perl -MEncode -e '$_="\xc3\x81mbito"; print $_; $_ = decode "utf-8", $_; print encode "iso-8859-1", $_'
perl -MEncode -e '$_="\xc1mbito"; print $_; $_ = decode "iso-8859-1", $_; print encode "utf-8", $_'

Works like a charm ;-)
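For comparison, here is a minimal sketch of both routes side by side: the two-step decode/encode and the single from_to call. The variable names and the hex_bytes helper are made up for illustration; only decode, encode and from_to come from Encode.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes { join ' ', map { sprintf '%02x', ord } split //, shift }

my $latin1 = "\xc1mbito";    # "Ámbito" as 6 ISO-8859-1 bytes

# Two steps: bytes -> characters -> bytes in the target encoding.
my $two_step = encode('utf-8', decode('iso-8859-1', $latin1));

# One call: from_to rewrites its first argument in place and returns
# the new length in bytes, so the converted data ends up in $one_call
# and $len only holds a number.
my $one_call = $latin1;
my $len      = from_to($one_call, 'iso-8859-1', 'utf-8');

print hex_bytes($two_step), "\n";    # c3 81 6d 62 69 74 6f
print hex_bytes($one_call), "\n";    # c3 81 6d 62 69 74 6f
print "from_to returned $len\n";     # from_to returned 7
```

If the one-call version seems to "do nothing", a likely culprit is using the return value of from_to as if it were the converted string.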

saludos,
--shmem

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}


Replies are listed 'Best First'.

Re^2: Encode: unable to change encoding of strings
by Hue-Bond (Priest) on Jul 09, 2006 at 08:44 UTC

from_to does decode and encode in place

Yes, that's what the documentation says. I'm using it accordingly, so no surprises here.

perl -MEncode -e '$_="\xc1mbito"; print $_; $_ = decode "iso-8859-1", $_; print encode "utf-8", $_'

So you are using decode to translate the input from ISO-8859-1 into "Perl's internal form", whatever that is, and then encode to print it out in the desired encoding. That's two calls, and I think this "problem" could be solved with just one; after all, it's a simple matter of changing the encoding of a string! The gotcha may be that I'm cheating by assuming that "Perl's internal form" is UTF-8 (I think it is, but I shouldn't be assuming it anyway). So what I was trying to do was decode the ISO-8859-1 input into UTF-8 with a single call to decode and then use it without further modification (this is the third example in the OP; the others just illustrate the issue).
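A small sketch of why that assumption bites, under the assumption that the input is the six-byte ISO-8859-1 string from the snippet above (variable names are mine): what decode hands back is best treated as a string of characters, not as UTF-8 bytes, and only encode gives you bytes in a known encoding.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $bytes = "\xc1mbito";                      # ISO-8859-1 input, 6 bytes
my $chars = decode('iso-8859-1', $bytes);     # character string

my $n_chars   = length $chars;                # counts characters
my $first     = ord substr($chars, 0, 1);     # code point of the first character
my $as_utf8   = encode('utf-8', $chars);      # bytes in UTF-8
my $as_latin1 = encode('iso-8859-1', $chars); # bytes in ISO-8859-1 again

printf "characters: %d, first code point: U+%04X\n", $n_chars, $first;
printf "UTF-8 bytes: %d, ISO-8859-1 bytes: %d\n",
    length $as_utf8, length $as_latin1;
# characters: 6, first code point: U+00C1
# UTF-8 bytes: 7, ISO-8859-1 bytes: 6
```

Printing $chars itself to a handle with no encoding layer writes code points below 256 as single bytes, so U+00C1 comes out as 0xC1 again and the output looks as if nothing had been converted.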

Your snippet makes sense and agrees with something I read recently: data should be decoded when acquired, used within the program, and finally encoded again when it is handed back to the outside world.
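One way to follow that advice, as far as I understand it, is to let I/O layers do the decoding and encoding at the edges. The sketch below uses in-memory filehandles to stand in for a real input file and a real output stream; the variable names and the uc step are only there for illustration.

```perl
use strict;
use warnings;

# Stand-in for a file that contains ISO-8859-1 data.
my $input_bytes = "\xc1mbito\n";

# Decode on the way in: the layer turns bytes into characters.
open my $in, '<:encoding(iso-8859-1)', \$input_bytes
    or die "cannot open input: $!";
my $line = <$in>;
close $in;
chomp $line;

# Work on characters inside the program.
my $shout = uc $line;

# Encode on the way out: the layer turns characters into UTF-8 bytes.
my $output_bytes = '';
open my $out, '>:encoding(UTF-8)', \$output_bytes
    or die "cannot open output: $!";
print {$out} "$shout\n";
close $out;

printf "%d bytes in, %d bytes out\n",
    length $input_bytes, length $output_bytes;
```

With real files the open calls would take a filename instead of a scalar reference, and for STDOUT the equivalent would be binmode(STDOUT, ':encoding(UTF-8)').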

--
David Serrano
