in reply to Re^2: Confusing UTF-8 bug in CGI-script
in thread Confusing UTF-8 bug in CGI-script

What do you mean, "isn't text"?

The text you typed into your browser is transformed by it as follows:

  1. It is encoded using the proper character encoding.
  2. Some of it is encoded using percent encoding.
  3. The resulting string is joined to others to form an application/x-www-form-urlencoded document.
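
The three steps above can be sketched in Perl roughly as follows. This is only an illustration of what the browser does, not its actual code: the percent_encode helper, its set of unescaped characters, and the field names are assumptions made for the example.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: percent-encode every byte outside a small safe set,
# then write spaces as "+", as form encoding conventionally does.
sub percent_encode {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9_.\- ])/sprintf('%%%02X', ord $1)/ge;
    $bytes =~ tr/ /+/;
    return $bytes;
}

# Illustrative form fields; "\x{F5}" is the character õ (U+00F5).
my %form = (name => "N\x{F5}nda", q => 'a b');

my $body = join '&', map {
    # Step 1: character encoding. Step 2: percent encoding.
    my $key   = percent_encode(encode('UTF-8', $_));
    my $value = percent_encode(encode('UTF-8', $form{$_}));
    "$key=$value";                      # Step 3: join into one document
} sort keys %form;

print "$body\n";   # name=N%C3%B5nda&q=a+b
```

The result contains only ASCII bytes: the single character õ has become the six bytes "%C3%B5".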

That leaves you with something that's no longer your text. The proper inverse of that is:

  1. Split the form data into its components.
  2. Remove any percent encoding.
  3. Remove the character encoding.
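
Done by hand, the inverse might look like the sketch below. It assumes the form data has already been read as raw bytes into $body (the sample value is made up), and it is not how CGI.pm is implemented internally; it only shows the order of the steps.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they arrive from the browser (illustrative sample).
my $body = 'name=N%C3%B5nda&q=a+b';

my %form;
for my $pair (split /[&;]/, $body) {            # 1. split into components
    my ($key, $value) = split /=/, $pair, 2;
    for ($key, $value) {
        $_ = '' unless defined;
        tr/+/ /;                                 # "+" stands for a space
        s/%([0-9A-Fa-f]{2})/chr hex $1/ge;       # 2. remove percent encoding
        $_ = decode('UTF-8', $_);                # 3. remove character encoding
    }
    $form{$key} = $value;
}

binmode STDOUT, ':encoding(UTF-8)';
printf "%s has %d characters\n", $form{name}, length $form{name};
```

Only after the third step is $form{name} text again: five characters, not six bytes.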

You're adding an additional step:

  1. Remove the character encoding. (XXX)
  2. Split the form data into its components.
  3. Remove any percent encoding.
  4. Remove the character encoding.

The fourth step notices something is odd and throws an error.
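
The general problem with removing a character encoding twice can be shown in isolation. This is a minimal sketch using Encode directly, not a reproduction of the exact failure in the original script; the sample bytes are chosen for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xE2\x82\xAC";            # the UTF-8 bytes for U+20AC (euro sign)

# Decoding once is correct: three bytes become one character.
my $text = decode('UTF-8', $bytes);

# Decoding again is the extra step: $text is no longer a byte string,
# so Encode has nothing sensible to decode and dies.
my $again = eval { decode('UTF-8', $text) };
my $error = $@;

print "length after one decode: ", length($text), "\n";
print "second decode failed: $error" if $error;
```

With characters below U+0100, such as õ, a second decode may not die at all and can silently produce a replacement character instead, which is arguably worse.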

And how, then, do I transfer the text and make Perl understand that it is UTF-8 encoded?

That's what the «-utf8» in «use CGI qw(:all -utf8);» does. "This makes CGI.pm treat all parameters as UTF-8 strings" by passing them to Encode's decode.
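
A small way to see the effect, assuming CGI.pm is installed (it is no longer bundled with recent perls). The request is simulated through environment variables so the sketch runs from the command line; the parameter name and value are made up, and the object interface is used here instead of «:all» only to keep the example short.

```perl
use strict;
use warnings;
use CGI qw(-utf8);    # the -utf8 pragma from the post; assumes CGI.pm is installed

# Simulate a GET request so the sketch runs from the command line.
# A real CGI script would get these from the web server.
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'name=N%C3%B5nda';

my $q    = CGI->new;
my $name = $q->param('name');   # already split, unescaped, and decoded

binmode STDOUT, ':encoding(UTF-8)';
printf "%s has %d characters\n", $name, length $name;
```

With «-utf8» in place, no manual decoding of the parameters is needed, and none should be added.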

Re^4: Confusing UTF-8 bug in CGI-script
by wanradt (Scribe) on Feb 01, 2011 at 21:13 UTC
    That's what the «-utf8» in «use CGI qw(:all -utf8);» does.

    And without "-utf8" I can't use CGI properly, because UTF-8 encoded GET parameters are then treated wrongly? I mean, STDIN I can fix with "use open ..."

    I'm finally getting somewhere... Thank you!

    Nõnda, WK