in reply to Re^2: Help for "Cannot decode string with wide characters..." and CGI.pm
in thread Help for "Cannot decode string with wide characters..." and CGI.pm

Maybe something like this:
    #!/usr/bin/perl -l
    BEGIN { $| = 1; }    # unbuffer STDOUT
    use strict;
    use warnings;
    use utf8;
    use Encode;
    use CGI qw/:standard -utf8/;    # -utf8: param() returns decoded strings
    use CGI::Carp qw/fatalsToBrowser set_message/;
    $CGI::PARAM_UTF8 = 1;

    BEGIN {
        # Custom handler for fatal errors reported to the browser
        sub handle_errors {
            my $msg = shift;
            print "<h1>There's a problem</h1>";
            print "<p>Cannot decode string: $msg</p>";
        }
        set_message(\&handle_errors);
    }

    my $q = CGI->new;
    binmode STDOUT, ":encoding(UTF-8)";
    my $referer_url = $q->url;
    print $referer_url;

Replies are listed 'Best First'.
Re^4: Help for "Cannot decode string with wide characters..." and CGI.pm
by PerlBroker (Acolyte) on Apr 08, 2012 at 17:26 UTC
    I understand that strings should not be decoded twice. I got it to work without errors, but the UTF-8 text is still not displayed correctly.
        print $q->header(-charset => 'utf-8');
        my $val = $q->param('key');
        print utf8::is_utf8($val);
        exit;
    This test gives me 1, which means the value is UTF-8. But the value is not correctly displayed, just strange signs. Do you know what to do?

      But the value is not correctly displayed, just strange signs. Do you know what to do?

      verify everything :)

      All this encoding/decoding only makes sure the bytes you send are correct; it doesn't ensure that the browser interprets the HTTP response and the HTML as UTF-8.
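A minimal sketch of that distinction, using only Encode and no CGI.pm. The helper name `build_response` and the sample character are made up for illustration: `encode` takes care of the bytes, while the `charset` in the `Content-Type` header is what tells the browser how to read them.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: build a minimal CGI-style response as raw bytes.
# $text is expected to be a decoded character string.
sub build_response {
    my ($text) = @_;
    # The header declares how the browser should interpret the body.
    my $header = "Content-Type: text/html; charset=UTF-8\r\n\r\n";
    # encode() turns characters into the UTF-8 octets that go on the wire.
    return $header . encode('UTF-8', "<p>$text</p>\n");
}

my $chars = decode('UTF-8', "\xE2\x98\xBA");   # three octets in, one character out
print build_response($chars);
```

If the header said `charset=ISO-8859-1` instead, the body bytes would be identical but the browser would show three odd characters rather than one.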

      You say some browsers are not displaying what you want? Start with the browsers (page info, error console, view source .... ) then move on to http://validator.w3.org/unicorn/

      For CGI.pm

      $ perl -le " use CGI -utf8; my $q = CGI->new; print $q->header, $q->start_html "
      Content-Type: text/html; charset=ISO-8859-1
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
      <head>
      <title>Untitled Document</title>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
      </head>
      <body>

      $ perl -le " use CGI -utf8; my $q = CGI->new; print $q->header(qw/ -charset UTF-8 / ), $q->start_html "
      Content-Type: text/html; charset=UTF-8
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
      <head>
      <title>Untitled Document</title>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
      </head>
      <body>

      See also Tutorials: perlunitut: Unicode in Perl, perluniintro, Perl Unicode Essentials

Re^4: Help for "Cannot decode string with wide characters..." and CGI.pm
by PerlBroker (Acolyte) on Apr 08, 2012 at 16:57 UTC
    Thank you for this tip, I will use it to get better error messages. But it still does not give me the string in UTF-8, and it shows the error again. The code is the same as yours, with this at the end:
        print $q->header;
        print decode utf8 => $q->param('key');
    The "decode utf8" is not needed in my opinion, as the input is already UTF-8, and with it the same "wide character" error as above is shown. Without "decode utf8 =>" I get no errors, but I get these characters instead: ��. That is strange, as the script just worked fine with Unicode. Any clues?
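      The answer is simple: do not try to decode an already-composed URI. With -utf8 in effect, CGI.pm has already decoded the parameter into a character string, so decoding it a second time fails. That's why it works without 'decode utf8'. Does that help?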
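The double-decode failure can be reproduced outside CGI.pm. This is only a sketch: `decode_error` is a made-up helper, and the sample string stands in for whatever `param('key')` returns once `-utf8` has already decoded it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return the error message raised when decoding
# $value as UTF-8, or an empty string when decoding succeeds.
sub decode_error {
    my ($value) = @_;
    my $ok = eval { decode('UTF-8', $value); 1 };
    return $ok ? '' : "$@";
}

my $bytes = "\xE2\x98\xBA";            # raw UTF-8 octets for U+263A
my $chars = decode('UTF-8', $bytes);   # decoded once: a one-character string

# Decoding raw octets is fine; decoding the decoded string dies.
print "bytes: ", (decode_error($bytes) || "ok"), "\n";
print "chars: ", (decode_error($chars) || "ok"), "\n";
```

The first line should report success, and the second should carry the "wide characters" complaint from the thread title.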