in reply to Re: Confusing UTF-8 bug in CGI-script
in thread Confusing UTF-8 bug in CGI-script

The script also works when I comment out the lines «use locale;» and «use open ':std' => ':encoding(UTF-8)';» (which are superfluous at best, IMHO).

«use locale;» is indeed superfluous, since the OP's script doesn't perform any locale-sensitive operations (cmp, lc, etc.). It's also irrelevant to the OP's question, since it doesn't affect encoding.

«use open ':std' => ':encoding(UTF-8)';» is not superfluous. Part of what it does is necessary, and the other part of what it does is wrong. Specifically,

BEGIN {
    # Wrong, and the cause of the OP's problem. See my reply to the OP.
    binmode(STDIN, ':encoding(UTF-8)');
    # Necessary to encode the returned HTML.
    binmode(STDOUT, ':encoding(UTF-8)');
    # Necessary to encode error messages for the log.
    binmode(STDERR, ':encoding(UTF-8)');
}

It could be replaced with the following or something equivalent, but it shouldn't be eliminated.

BEGIN {
    binmode(STDIN);                       # Form data
    binmode(STDOUT, ':encoding(UTF-8)');  # HTML
    binmode(STDERR, ':encoding(UTF-8)');  # Error messages
}
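
To make the division of labour concrete, here is a minimal sketch that uses in-memory filehandles as stand-ins for STDIN and STDOUT. The payload and all variable names are made up for illustration. It assumes the form data arrives as raw UTF-8 bytes, reads them with no decoding layer, decodes them once explicitly, and lets the output layer do the encoding.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical payload: "name=" followed by U+00E9 as two UTF-8 bytes.
my $wire_bytes = "name=\xC3\xA9";

# Input side: no decoding layer, like binmode(STDIN).
open(my $in, '<', \$wire_bytes) or die "open failed: $!";
binmode($in);
my $raw = <$in>;
close($in);

# Decode exactly once, explicitly, to get a character string.
my $text = decode('UTF-8', $raw);

# Output side: the layer encodes characters back to UTF-8 bytes,
# like binmode(STDOUT, ':encoding(UTF-8)').
my $out_buf = '';
open(my $out, '>', \$out_buf) or die "open failed: $!";
binmode($out, ':encoding(UTF-8)');
print {$out} $text;
close($out);

printf "raw: %d bytes, decoded: %d chars, written: %d bytes\n",
    length($raw), length($text), length($out_buf);
```

The point of the sketch is only that bytes come in untouched, are decoded in one place, and are encoded again by the output handle; a real CGI script would normally leave the parsing of the input to its form-handling module.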

Re^3: Confusing UTF-8 bug in CGI-script
by Anonyrnous Monk (Hermit) on Feb 01, 2011 at 18:55 UTC
    # Wrong, and the cause of the OP's problem. See my reply to the OP.
    binmode(STDIN, ':encoding(UTF-8)');

    That's what I would've thought, too, but interestingly, it doesn't do any harm in practice (I did try it), and

    # Necessary to encode the returned HTML.
    binmode(STDOUT, ':encoding(UTF-8)');

    only seems to be required with newer versions of CGI.pm (as I mentioned). Older versions apparently did the encoding themselves before printing to STDOUT (?)

      it doesn't do any harm in practice

      I don't know how you can say that after saying yourself that removing it also fixes the OP's problem.

      Update: Well, you said that removing use open fixes the issue, but I doubt you're claiming that binmoding output handles leads to a decoding error, so that leaves the binmoding of the input handle.

        How can you say that? You said yourself...

        If you re-read carefully what I said, you'll see that I said the script works both as is and when I remove those use statements (except for the special case I mentioned in the correction).
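
One possible reason a decoding layer on STDIN can appear harmless in some tests, offered here as an assumption rather than something established in this thread: a URL-encoded form body is plain ASCII, so a UTF-8 decoding layer returns it unchanged, whereas a body containing raw UTF-8 bytes is altered. The sketch below compares the two cases with in-memory handles; the helper `read_with_layer` and both sample bodies are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read a byte string back through the given PerlIO layer.
sub read_with_layer {
    my ($bytes, $layer) = @_;
    open(my $fh, '<', \$bytes) or die "open failed: $!";
    binmode($fh, $layer) or die "binmode failed: $!";
    local $/;    # slurp mode
    my $data = <$fh>;
    close($fh);
    return $data;
}

my $urlencoded = 'name=%C3%A9';      # percent-escaped, pure ASCII
my $raw_utf8   = "name=\xC3\xA9";    # the same value as raw UTF-8 bytes

# Does the decoding layer change what the script sees?
my $same_urlencoded = read_with_layer($urlencoded, ':raw')
                   eq read_with_layer($urlencoded, ':encoding(UTF-8)');
my $same_raw        = read_with_layer($raw_utf8, ':raw')
                   eq read_with_layer($raw_utf8, ':encoding(UTF-8)');

print "URL-encoded body unchanged by layer: ", ($same_urlencoded ? "yes" : "no"), "\n";
print "Raw UTF-8 body unchanged by layer:   ", ($same_raw        ? "yes" : "no"), "\n";
```

If that assumption holds for a given request, the layer on STDIN only starts to matter once the input contains non-ASCII bytes, which would be consistent with both observations reported above.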