Maybe something like this:
#!/usr/bin/perl -l
BEGIN { $| = 1; }    # unbuffered output, set before anything is printed
use strict;
use warnings;
use utf8;
use Encode;
use CGI qw/:standard -utf8/;    # -utf8: incoming parameters are decoded as UTF-8
use CGI::Carp qw/fatalsToBrowser set_message/;
$CGI::PARAM_UTF8 = 1;

# Register the custom error page at compile time so it also covers early errors.
BEGIN {
    sub handle_errors {
        my $msg = shift;
        print "<h1>There's a problem</h1>";
        print "<p>Cannot decode string: $msg</p>";
    }
    set_message(\&handle_errors);
}

my $q = CGI->new;
binmode STDOUT, ":encoding(UTF-8)";    # encode everything printed as UTF-8
my $referer_url = $q->url;
print $referer_url;
I understand that strings must not be decoded twice.
I got it to work without errors, but the UTF-8 characters are not displayed correctly.
print $q->header(-charset => 'utf-8');
my $val = $q->param('key');
print utf8::is_utf8($val); exit;
This test prints 1, so the value has Perl's internal UTF8 flag set, which I take to mean it is UTF-8.
But the value is not displayed correctly; I just see strange characters.
Do you know what to do?
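For what it's worth, a result of 1 from utf8::is_utf8 only says that Perl holds the value internally as decoded characters; it says nothing about the bytes that reach the browser. A minimal sketch with the core Encode module (the variable names are made up for illustration and are not from the script above):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of the euro sign, as they would arrive in a request.
my $bytes = "\xE2\x82\xAC";

# Decoding turns the bytes into one character (U+20AC) and sets the flag.
my $chars = decode('UTF-8', $bytes);

my $flag_on_bytes = utf8::is_utf8($bytes) ? 1 : 0;
my $flag_on_chars = utf8::is_utf8($chars) ? 1 : 0;

# Before output, the characters have to become bytes again, either with
# encode() or with an :encoding(UTF-8) layer on the output handle.
my $wire = encode('UTF-8', $chars);

printf "bytes: length %d, flag %d\n", length($bytes), $flag_on_bytes;
printf "chars: length %d, flag %d\n", length($chars), $flag_on_chars;
printf "round trip ok: %s\n", ($wire eq $bytes ? 'yes' : 'no');
```

So the flag is probably not the thing to test here; what matters is that the output is encoded once on the way out and that the browser is told which charset to expect.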
But the value is not correctly displayed, just strange signs. Do you know what to do?
Verify everything :)
All this encoding/decoding only makes sure the bytes you send are proper UTF-8; it doesn't ensure that the browser interprets the HTTP response and the HTML as UTF-8.
You say some browsers are not displaying what you want? Start with the browsers (page info, error console, view source, ...), then move on to http://validator.w3.org/unicorn/
For CGI.pm, note that the header declares ISO-8859-1 unless you pass a charset:
$ perl -le " use CGI -utf8; my $q = CGI->new; print $q->header, $q->start_html "
Content-Type: text/html; charset=ISO-8859-1
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body>
$ perl -le " use CGI -utf8; my $q = CGI->new; print $q->header(qw/ -charset UTF-8 / ), $q->start_html "
Content-Type: text/html; charset=UTF-8
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
See also Tutorials: perlunitut: Unicode in Perl, perluniintro, Perl Unicode Essentials
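To make the two halves concrete (declaring the charset, and actually emitting UTF-8 bytes), here is a rough sketch that uses only core Perl. It is not how CGI.pm does it internally: the header is written by hand in place of $q->header so the snippet runs without CGI.pm, and build_response is a hypothetical helper name invented for this example.

```perl
use strict;
use warnings;

# Build a complete response as bytes: a header that declares UTF-8,
# followed by a body that the output layer encodes as UTF-8.
sub build_response {
    my ($text) = @_;    # $text is a decoded character string
    my $buffer = '';
    open my $out, '>:encoding(UTF-8)', \$buffer
        or die "Cannot open in-memory handle: $!";
    # 1. Tell the browser how to interpret the bytes that follow.
    print {$out} "Content-Type: text/html; charset=UTF-8\r\n\r\n";
    # 2. The :encoding layer turns the characters into UTF-8 bytes.
    print {$out} "<p>$text</p>\n";
    close $out or die "Cannot close in-memory handle: $!";
    return $buffer;
}

# Seven characters: G, r, u-umlaut, sharp s, e, space, euro sign.
my $response = build_response("Gr\x{FC}\x{DF}e \x{20AC}");
print $response;
```

In a real script the same effect should come from binmode STDOUT, ":encoding(UTF-8)" together with $q->header(-charset => 'UTF-8'); if either half is missing, the browser can end up showing strange characters even though the string inside Perl is fine.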
Thank you for this tip; I will use it to get better error messages. But it still does not give me the string in UTF-8, and it shows the error again.
The code is the same as yours, with this at the end:
print $q->header;
print decode utf8 => $q->param('key');
In my opinion the "decode utf8" is not needed, as the input is already UTF-8, and with it the same "wide character" error as above is shown.
Without "decode utf8 =>" I get no errors, but I get these characters: �� -- which is strange, as the script previously worked fine with Unicode.
Any clues?
The answer is simple: do not try to decode a string that has already been decoded.
With -utf8, CGI.pm has already decoded the parameter for you. That's why it works without 'decode utf8'. Does that help?
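If you want to see the effect in isolation, something like the following should reproduce it with the core Encode module alone. The byte string is an invented stand-in for a raw request parameter, and the variable names are only for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 bytes of a 7-character string:
# G, r, u-umlaut, sharp s, e, space, euro sign.
my $raw = "Gr\xC3\xBC\xC3\x9Fe \xE2\x82\xAC";

# First decode: bytes become characters. This is the step that CGI.pm
# already performs when -utf8 is in effect.
my $once = decode('UTF-8', $raw);

# Second decode: the input is no longer bytes, so Encode refuses it.
my $twice = eval { decode('UTF-8', $once) };
my $error = defined $twice ? '' : $@;

print "characters after one decode: ", length($once), "\n";
print "second decode failed: $error" if $error;
```

The second call dies because the string contains a character above 255 (the euro sign), which cannot be a byte. That is the "wide character" complaint you are seeing.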