If by "a simple scalar (unblessed, etc.)" you mean "any numeric value, or any string value that does not have the utf8 flag set", then no, there is no value of $x for which test($x) returns 0.

But as hinted by massa above, a scalar with the utf8 flag turned on can cause the script to die with a run-time error (as far as I can tell, whenever it holds a character above 0xFF):

Wide character in subroutine entry at /.../Encode.pm line ...

because "decode()" expects octets, and a string that is already flagged as perl-internal utf8 cannot be taken as octets when it holds a character above 0xFF.
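A minimal sketch for checking when that error fires on a given installation. The helper name try_decode() is made up for illustration, and it is an assumption on my part (worth verifying against your own Encode version) that a flagged string whose characters all fit in one byte decodes without complaint, while one with a code point above 0xFF dies:

```perl
use strict;
use warnings;
use Encode qw(decode);

# try_decode() is a hypothetical helper, not part of Encode:
# it reports whether decode() survived or what it died with.
sub try_decode {
    my ($input) = @_;
    my $result = eval { decode('iso-8859-1', $input) };
    return defined $result ? 'ok' : "died: $@";
}

my $narrow = "\xe9";
utf8::upgrade($narrow);    # utf8 flag on, character still fits in one byte
my $wide = "\x{263a}";     # utf8 flag on, code point above 0xFF

printf "narrow flagged: %s\n", try_decode($narrow);
printf "wide flagged:   %s\n", try_decode($wide);
```

Wrapping the call in eval keeps the script alive, so both cases can be compared in one run.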

~~There does seem to be some suggestion of discrepancy between the Encode man page and the behavior of "eq" and "ne";~~ the man page says:

...to convert ISO-8859-1 data to a string in Perl's internal format:
$string = decode("iso-8859-1", $octets);
CAVEAT: When you run "$string = decode("utf8", $octets)", then $string may not be equal to $octets. Though they both contain the same data, the utf8 flag for $string is on unless $octets entirely consists of ASCII data (or EBCDIC on EBCDIC machines).

(Update: thanks to almut for catching/explaining how I misread this point.)

But the following script (when run with perl 5.8.8 on darwin) shows that the flag setting seems to have no effect on "eq" for the characters in question (the "high table" portion of 8859-1) -- every output line says "(flag diff...) decoding ... makes no difference":

#!/usr/bin/perl

use Encode qw/encode decode is_utf8/;

for my $scalar ( map { encode( 'iso-8859-1', chr( $_ )) } 0xa0 .. 0xff ) {
    printf( "decoding %s makes %s difference\n", $scalar,
            ( test( $scalar ) ? "no" : "some sort of" ));
}

sub test {
  my $x = shift;
  my $y = Encode::decode('iso-8859-1', $x);
  print "(flag diff...) " if ( is_utf8( $x ) ne is_utf8( $y ));
  if ($x eq $y) {
    return 1;
  } else {
    return 0;
  }
}

So I wonder whether there are any perl versions or installations where the caveat actually applies to "eq" and "ne", or whether there is some other comparison operator on my version/installation that would catch the difference in the flag setting.
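One comparison that does seem to catch the difference is "eq" under the bytes pragma, which looks at the internal byte buffers rather than at characters. This is only a sketch: same_chars() and same_bytes() are names invented here, and "use bytes" is generally discouraged outside of debugging, so take it as a diagnostic rather than a recommendation:

```perl
use strict;
use warnings;
use Encode qw(decode is_utf8);

my $octets = "\xe9";                            # one byte, utf8 flag off
my $string = decode('iso-8859-1', $octets);     # same character, utf8 flag on

# Hypothetical helper: ordinary character-wise comparison.
sub same_chars {
    my ($left, $right) = @_;
    return $left eq $right ? 1 : 0;
}

# Hypothetical helper: compare the internal byte buffers instead.
sub same_bytes {
    my ($left, $right) = @_;
    use bytes;                                  # lexically scoped to this sub
    return $left eq $right ? 1 : 0;
}

printf "flags differ: %s\n", ( is_utf8($octets) ne is_utf8($string) ) ? "yes" : "no";
printf "eq as characters: %d\n", same_chars($octets, $string);
printf "eq under 'use bytes': %d\n", same_bytes($octets, $string);
```

The plain "eq" compares characters regardless of how each scalar is stored, which would explain why the flag makes no visible difference in the script above.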

In reply to Re: question about Encode::decode('iso-8859-1', ...) by graff
in thread question about Encode::decode('iso-8859-1', ...) by perl5ever
