Re^3: CGI hidden params vs. character encoding

it worked. How strange...

I found it strange too. I just clued in to what the error is.

First of all,

binmode STDOUT, ':utf-8';

is a no-op, since there's no "utf-8" layer.

>perl -le"print binmode(STDERR, ':utf8')?1:0"
1

>perl -le"print binmode(STDERR, ':utf-8')?1:0"
0

>perl -le"print binmode(STDERR, ':encoding(utf8)')?1:0"
1

>perl -le"print binmode(STDERR, ':encoding(utf-8)')?1:0"
1
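
In a script, the same check can be made by looking at binmode's return value rather than ignoring it. The following is only a sketch: `layer_ok` is a made-up helper name, and it tries the layer on a scratch in-memory handle (an assumption for illustration) instead of STDERR so that nothing real gets reconfigured.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether binmode accepts a layer spec.
# It uses a throwaway in-memory handle so STDOUT/STDERR are left alone.
sub layer_ok {
    my ($layer) = @_;
    open(my $fh, '>', \my $buf) or die "open failed: $!";
    no warnings 'layer';    # silence the "Unknown PerlIO layer" warning
    return binmode($fh, $layer) ? 1 : 0;
}

print "$_ => ", layer_ok($_), "\n"
    for ':utf8', ':utf-8', ':encoding(UTF-8)';
```

In real code, something like `binmode(STDOUT, ':encoding(UTF-8)') or die "binmode failed";` would turn a mistyped layer name into an immediate error instead of a silent no-op.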

If we do it properly (:encoding(utf-8)), we end up with your original problem.

Your problem is that you are double-encoding! You're telling CGI to encode your data using UTF8 (-charset => 'utf-8') and then you encode it again using binmode STDOUT, ":utf8";.
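
To make "double-encoding" concrete, here is a minimal sketch that encodes one non-ASCII character once and then a second time. It is not taken from the OP's code; `hex_bytes` is a hypothetical helper added only to display the bytes.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical helper: show each byte of a string as two hex digits.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $char  = "\x{E9}";                  # U+00E9, e with acute accent
my $once  = encode('UTF-8', $char);    # encoded once: 2 bytes
my $twice = encode('UTF-8', $once);    # encoded again: 4 bytes

print hex_bytes($once),  "\n";         # C3 A9
print hex_bytes($twice), "\n";         # C3 83 C2 A9
```

The second pass treats each byte of the first result as a character of its own and encodes it again, which is where the familiar two-character garbage in place of one accented letter comes from.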

The solution is to get rid of binmode completely and only use CGI's methods to output.

Re^4: CGI hidden params vs. character encoding by graff (Chancellor) on May 28, 2008 at 00:41 UTC
> Your problem is that you are double-encoding! You're telling CGI to encode your data using UTF8 (-charset => 'utf-8') and then you encode it again using binmode STDOUT, ":utf8";.

But... But... Then why did the double-encoding show up only in that one place?? If the behavior were consistent throughout, I would understand, but I still can't figure out how I got the particular behavior that I did.

> The solution is to get rid of binmode completely and only use CGI's methods to output.

I'm not sure about that. If I comment out the "binmode STDOUT..." in the OP code (having fixed all other encoding specs to "UTF-8" as described), I get "Wide character in print" warnings showing up in the error log. Also, I don't think I should have to rely entirely on CGI methods for printing content.
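
The "Wide character in print" warning can be reproduced outside of CGI. The sketch below is an illustration under stated assumptions, not the OP's script: `count_print_warnings` is a hypothetical helper that prints one character above U+00FF to a scratch in-memory handle, with or without an encoding layer, and counts the warnings raised.

```perl
use strict;
use warnings;

# Hypothetical helper: print a wide character to a scratch handle and
# return how many warnings that produced. Pass undef for "no layer".
sub count_print_warnings {
    my ($layer) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    open(my $fh, '>', \my $buf) or die "open failed: $!";
    if (defined $layer) {
        binmode($fh, $layer) or die "binmode failed for $layer";
    }
    print {$fh} "\x{263A}\n";    # U+263A does not fit in a single byte
    close($fh);

    return scalar @warnings;
}

print "no layer:          ", count_print_warnings(undef), "\n";
print ":encoding(UTF-8):  ", count_print_warnings(':encoding(UTF-8)'), "\n";
```

Without a layer the handle expects bytes, so Perl warns when handed a character that cannot be a byte; with an encoding layer the handle does the encoding itself and stays quiet.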
Re^5: CGI hidden params vs. character encoding by ikegami (Patriarch) on May 28, 2008 at 01:24 UTC
> But... But... Then why did the double-encoding show up only in that one place??

Because the rest were ASCII characters.

use Encode qw( encode );
$str = '<p>foo</p>';
for (1..5) {
    print("$str\n");
    $str = encode('UTF-8', $str);
}

<p>foo</p>
<p>foo</p>
<p>foo</p>
<p>foo</p>
<p>foo</p>

> I'm not sure about that. If I comment out the "binmode STDOUT..." in the OP code (having fixed all other encoding specs to "UTF-8" as described), I get "Wide character in print" warnings showing up in the error log

ARGH! CGI doesn't seem to be encoding. What's `-charset` for, then!? I need to look into this more.

By the way, `<p/>` makes no sense. `<p/>text<p/>text` means `<p></p>text<p></p>text`, but what you want is `<p>text</p><p>text</p>`.