in reply to Re^5: Confusing UTF-8 bug in CGI-script
in thread Confusing UTF-8 bug in CGI-script

ok, I follow.

Note that even if it seems to work, that doesn't make what you say correct. Whether the OP's code works depends on how the client encodes the request, so working for one client doesn't make it right.

I suspect one of two reasons for the differences:

Update: Cleaned up. Replaced first paragraph. (It was "Gotcha.", which is ambiguous.)

Re^7: Confusing UTF-8 bug in CGI-script
by Anonyrnous Monk (Hermit) on Feb 01, 2011 at 21:59 UTC
    I suspect one of two reasons for the differences: ...

    Neither of those is the case.

    I've verified (by sending the request through a proxy) that the content is sent UTF-8 encoded (i.e. no %-encoding).  My Encode::decode also behaves normally (i.e. it dies with "Cannot decode string with wide characters" when handed a string containing wide characters).  This is, however, irrelevant, because CGI.pm has code to prevent double-decoding:

    sub _decode_utf8 {
        my ($self, $val) = @_;
        if (Encode::is_utf8($val)) {
            return $val;
        }
        else {
            return Encode::decode(utf8 => $val);
        }
    }

    This sufficiently explains the behavior I observed and reported (for the input side).
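
    The effect of such a guard can be tried in isolation. The following is a minimal sketch, not CGI.pm's actual code: decode_once is a hypothetical stand-in for the routine quoted above, and the sample string is made up for illustration. It only shows that a value already flagged as UTF-8 passes through unchanged, while raw bytes are decoded once.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for the guard quoted above (an assumption,
# not CGI.pm's source): decode only if the UTF8 flag is off.
sub decode_once {
    my ($val) = @_;
    return $val if Encode::is_utf8($val);    # already characters: leave alone
    return Encode::decode('UTF-8', $val);    # raw bytes: decode once
}

my $bytes = "caf\xC3\xA9";           # five UTF-8 encoded bytes
my $chars = decode_once($bytes);     # four characters after decoding
my $again = decode_once($chars);     # second call is a no-op

printf "bytes=%d chars=%d again=%d\n",
    length $bytes, length $chars, length $again;
```

    Note that the guard keys off Perl's internal UTF8 flag, so it tells you how the string is stored, not whether the data was ever UTF-8 encoded by the client.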

      Well, it's pretty damn close. I said

      Maybe your version of decode

      I should have said

      Maybe your version of the decoder