in reply to possible missunderstanding of package Encode
Your understanding of Encode is correct; the issue is how the original string gets into your program. The literal 'Köln' produces bytes in whatever encoding your script file is saved in, not a decoded character string. There are several ways to fix this:
1. Tell perl that all of your hard-coded strings are written in UTF-8. (Note: judging from the output you received, I'm certain that your editor is set up for UTF-8, but you should convince yourself of that too, and then learn how to configure it.)
    use utf8;   # All hard-coded strings will be assumed to be UTF-8
    my $temp = encode( "iso-8859-1", 'Köln' );
    ...
2. Tell perl that this one string was input as UTF-8 (again, it is UTF-8 because that is what your editor produces):
    my $temp = encode( "iso-8859-1", decode( "UTF-8", 'Köln' ) );
    ...
The second case most closely resembles what happens when you process a file or command-line arguments:
    use feature 'say';
    use Encode qw(encode decode);

    # Files (change input encoding to match file encoding):
    open my $F, "<:encoding(UTF-8)", "myfile"
        or die "Error reading myfile: $!";
    my $line = <$F>;    # $line contains a decoded string
    say encode( "iso-8859-1", $line );

    # Command-line args:
    my $arg = decode( "UTF-8", $ARGV[0] );

    # Or, command-line args are an appropriate use of Encode::Locale:
    use Encode::Locale;
    $arg = decode( "locale", $ARGV[0] );
Your output of "Köln(5)" tells us that your editor and your terminal both use UTF-8, and that $temp is double-encoded mojibake (just much less spectacularly obvious than the usual mojibake).
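To see where the length of 5 comes from, here is a minimal sketch. It assumes the script was saved as UTF-8, so the literal is written out as explicit byte escapes to keep the result independent of any editor setting. The variable names are only illustrative.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The five bytes a UTF-8 editor stores for 'Köln' when "use utf8" is not in effect.
my $raw = "K\xC3\xB6ln";

# Not decoded first: each byte is treated as one Latin-1 character,
# so encoding to iso-8859-1 hands the same five bytes back.
my $undecoded = encode( "iso-8859-1", $raw );

# Decoded first: the byte pair C3 B6 becomes the single character U+00F6,
# which iso-8859-1 stores as the single byte F6.
my $decoded = encode( "iso-8859-1", decode( "UTF-8", $raw ) );

for my $bytes ( $undecoded, $decoded ) {
    my $hex = join " ", map { sprintf "%02X", ord($_) } split //, $bytes;
    printf "%s (%d)\n", $hex, length $bytes;
}
# prints:
# 4B C3 B6 6C 6E (5)
# 4B F6 6C 6E (4)
```

A UTF-8 terminal renders the first byte sequence as "Köln", which is why the problem is easy to miss.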
Just keep in mind that once you decide to care about encoding, all input must first be decoded somehow (including strings typed directly into the program), and all output must be encoded. If you run into odd encoding issues, ask where the data was decoded and where it was encoded (and then ask yourself whether it was decoded or encoded twice).
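One way to follow that rule is to let I/O layers do both conversions, so each one happens exactly once at the program's boundary. The following is only a sketch: it uses in-memory filehandles in place of real files so that it is self-contained, and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Stand-in for a UTF-8 encoded input file: 'Köln' plus a newline, as raw bytes.
my $input_bytes = "K\xC3\xB6ln\n";

# Decode on the way in: the :encoding layer turns bytes into characters.
open my $in, "<:encoding(UTF-8)", \$input_bytes
    or die "Cannot open input: $!";
my $line = <$in>;
close $in;
chomp $line;

# Encode on the way out: the layer turns characters back into bytes.
my $output_bytes = "";
open my $out, ">:encoding(iso-8859-1)", \$output_bytes
    or die "Cannot open output: $!";
print {$out} $line;
close $out or die "Cannot close output: $!";

printf "characters: %d, output bytes: %d\n", length $line, length $output_bytes;
# prints: characters: 4, output bytes: 4
```

In between the two handles the program only ever sees decoded characters, which makes it easier to spot a second, accidental decode or encode.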
Good Day,
Dean