But as hinted by massa above, any scalar with the utf8 flag turned on will cause the script to die with a run-time error:

    Wide character in subroutine entry at /.../Encode.pm line ...

because you cannot "decode()" a string into perl-internal utf8 if it is already flagged as being perl-internal utf8.
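One way to sidestep that error, offered only as a sketch and not as anything proposed in this thread: check is_utf8() before calling decode(). The helper name decode_unless_flagged is hypothetical. The example uses a flagged string that holds a character above 0xFF, which is a case where decode() does die; the wording of the message differs between Encode versions, so the sketch only records whether the call died.

```perl
use strict;
use warnings;
use Encode qw/decode is_utf8/;

# Hypothetical helper (not part of Encode): decode only when the scalar
# is not already flagged as perl-internal utf8.
sub decode_unless_flagged {
    my ( $encoding, $data ) = @_;
    return is_utf8( $data ) ? $data : decode( $encoding, $data );
}

my $octets = "\xe9";                  # one raw byte, utf8 flag off
my $wide   = "caf\x{e9} \x{263a}";    # holds a character above 0xFF, flag on

my $from_octets = decode_unless_flagged( 'iso-8859-1', $octets );
my $from_wide   = decode_unless_flagged( 'iso-8859-1', $wide );

# An unguarded decode() of the already-flagged wide string dies.
my $unguarded_ok = eval { decode( 'iso-8859-1', $wide ); 1 } ? 1 : 0;

printf( "guarded octets decoded:    %s\n", $from_octets eq "\x{e9}" ? "yes" : "no" );
printf( "guarded wide string kept:  %s\n", $from_wide eq $wide      ? "yes" : "no" );
printf( "unguarded decode survived: %s\n", $unguarded_ok            ? "yes" : "no" );
```

The guard treats the flag as a stand-in for "already decoded", which is only a convention of this sketch; code that tracks where its data came from would not need it.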
There does seem to be some suggestion of discrepancy between the Encode man page and the behavior of "eq" and "ne"; the man page says:
...to convert ISO-8859-1 data to a string in Perl’s internal format:

    $string = decode("iso-8859-1", $octets);
CAVEAT: When you run "$string = decode("utf8", $octets)", then $string may not be equal to $octets. Though they both contain the same data, the utf8 flag for $string is on unless $octets entirely consists of ASCII data (or EBCDIC on EBCDIC machines).
(Update: thanks to almut for catching/explaining how I misread this point.)
But the following script (when run with perl 5.8.8 on darwin) shows that the flag setting seems to have no effect on "eq" for the characters in question (the "high table" portion of 8859-1) -- every output line says "(flag diff...) decoding ... makes no difference":

    #!/usr/bin/perl
    use Encode qw/encode decode is_utf8/;

    for my $scalar ( map { encode( 'iso-8859-1', chr( $_ )) } 0xa0 .. 0xff ) {
        printf( "decoding %s makes %s difference\n",
                $scalar, ( test( $scalar ) ? "no" : "some sort of" ));
    }

    sub test {
        my $x = shift;
        my $y = Encode::decode('iso-8859-1', $x);
        print "(flag diff...) " if ( is_utf8( $x ) ne is_utf8( $y ));
        if ($x eq $y) { return 1; } else { return 0; }
    }

So I wonder whether there are any perl versions or installations where the caveat actually applies to "eq" and "ne", or whether there is some other comparison operator on my version/installation that would catch the difference in the flag setting.
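For what it is worth, here is a minimal sketch of where the flag difference does become visible. It assumes the "bytes" pragma, which is generally discouraged outside of debugging and is used here only to inspect the internal representation; the variable names are made up for the illustration. Plain "eq" compares the two scalars character by character, so it agrees with the script above, while a comparison under "use bytes" looks at the internal byte buffers.

```perl
use strict;
use warnings;
use bytes ();                       # load bytes:: functions without enabling the pragma
use Encode qw/decode is_utf8/;

my $octets = "\xe9";                            # single byte, utf8 flag off
my $string = decode( 'iso-8859-1', $octets );   # same character, utf8 flag on

# Ordinary eq: character semantics, the flag plays no part.
my $same_chars = ( $octets eq $string ) ? 1 : 0;

# The flag itself differs between the two scalars.
my $flag_diff = ( is_utf8( $octets ) xor is_utf8( $string ) ) ? 1 : 0;

# Internal storage: one byte versus the two-byte utf8 form.
my $len_octets = bytes::length( $octets );
my $len_string = bytes::length( $string );

# eq under "use bytes" compares the internal buffers (debugging aid only).
my $same_bytes = do { use bytes; $octets eq $string } ? 1 : 0;

print "eq (characters):      $same_chars\n";    # 1
print "flags differ:         $flag_diff\n";     # 1
print "internal byte length: $len_octets vs $len_string\n";   # 1 vs 2
print "eq (use bytes):       $same_bytes\n";    # 0
```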
In reply to Re: question about Encode::decode('iso-8859-1', ...)
by graff
in thread question about Encode::decode('iso-8859-1', ...)
by perl5ever