in reply to Re^2: WWW::Mechanize always use utf8
in thread WWW::Mechanize always use utf8

First, the CGI code is not portable as you do not specify the encoding of the source code. Either use pure ASCII or add an encoding pragma ("use utf8;" if your source code is encoded in UTF-8).

Secondly, URL encoding is a historical problem. Originally, URLs were defined as ASCII only. But some people started to encode non-ASCII (8-bit) characters in them: some using iso-8859-1, some using UTF-8, some using other encodings.
Later the IETF standardized URL encoding for HTTP on UTF-8.
For backward compatibility, user agents use the encoding of the document that contains the form to decide which encoding to use in GET URLs. You can change this behavior in MSIE in the advanced settings.
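
To make the difference concrete, here is a minimal sketch of how the same form value ends up as different bytes in a GET URL depending on the encoding the user agent picks. The helper name percent_encode and the sample value are made up for illustration; only the core Encode module is used.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn the text into bytes in the given encoding,
# then percent-encode every byte that is not an RFC 3986 unreserved character.
sub percent_encode {
    my ($text, $encoding) = @_;
    my $bytes = encode($encoding, $text);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# "cafe" with an acute accent on the e, written with an escape so the
# example does not depend on the encoding of this source file.
my $value = "caf\x{E9}";

print percent_encode($value, 'iso-8859-1'), "\n";   # caf%E9
print percent_encode($value, 'UTF-8'),      "\n";   # caf%C3%A9
```

A server that expects one of these forms and receives the other will decode the parameter incorrectly, which is why the encoding of the page that carries the form matters.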

So WWW::Mechanize is working as expected. Change your CGI output to UTF-8 and WWW::Mechanize will probably send URLs encoded as UTF-8.

Re^4: WWW::Mechanize always use utf8
by ikegami (Patriarch) on Mar 23, 2009 at 13:20 UTC

    First, the CGI code is not portable as you do not specify the encoding of the source code.

    How is it not portable?

    The script will be portable no matter what encoding he specifies as long as the encoding in the following two lines match:

    binmode STDOUT, ':encoding(iso-8859-1)';
    ...
    print "Content-type: text/html; charset=iso-8859-1\n\n";

    The only question is whether the browser can encode using iso-8859-1 or not. I'd be very surprised to meet one that couldn't.
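
One way to keep those two lines from drifting apart is to take the charset from a single variable. This is only a sketch: the variable $charset, the helper content_type_header and the sample paragraph are assumptions for illustration, not part of the original script.

```perl
use strict;
use warnings;

# Single source of truth for the output encoding (assumed name).
my $charset = 'iso-8859-1';

# Hypothetical helper: build the header from the same charset that is
# used for the output layer below.
sub content_type_header {
    my ($cs) = @_;
    return "Content-type: text/html; charset=$cs\n\n";
}

# The output layer and the declared charset now always match.
binmode STDOUT, ":encoding($charset)";
print content_type_header($charset);

# Body text is written as characters; the layer converts it to bytes.
print "<p>caf\x{E9}</p>\n";
```

Switching the script to UTF-8 then means changing one assignment instead of two separate strings.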