
Found the bug.

For starters, everything works fine if the server sends

<form method="POST" accept-charset="iso-8859-15">

HTML::Form (used by WWW::Mechanize) processes that attribute and generates the correct form data. The bug is that WWW::Mechanize doesn't inform HTML::Form of the page's charset, leaving HTML::Form with no idea what to do when accept-charset is missing. (It defaults to using UTF-8.)

Some may not consider this a bug, since the spec simply recommends the behaviour, but it's what other browsers do.

Re^3: www:mechanize mangles unicode
by red0hat (Initiate) on Apr 28, 2010 at 23:12 UTC

    Wow. Thanks.

    Now I'm looking for a way to tell HTML::Form which character set to use from the client side.

      You can pass it at parse time with
      HTML::Form->parse(..., charset => $encoding)
      but you can also do it after the fact with
      $form->accept_charset($encoding)
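
For anyone who wants to try both approaches side by side, here is a minimal sketch. The HTML snippet, the example.com URLs, the field name `city` and the sample value are made up for illustration; only `HTML::Form->parse`, its `charset` option, `accept_charset`, `value` and `make_request` come from HTML::Form itself. Nothing is sent over the network; the script only builds the requests and prints their bodies.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical page: a form with no accept-charset attribute.
my $html = <<'END_HTML';
<form method="POST" action="http://example.com/submit">
<input type="text" name="city">
</form>
END_HTML

my $encoding = 'iso-8859-15';

# Option 1: tell HTML::Form the page's charset at parse time.
my $form = HTML::Form->parse($html, base => 'http://example.com/', charset => $encoding);

# Option 2: set the charset on an already-parsed form.
my $late = HTML::Form->parse($html, base => 'http://example.com/');
$late->accept_charset($encoding);

for my $f ($form, $late) {
    $f->value('city', "caf\x{e9}");   # e-acute as a Perl character
    my $req = $f->make_request;       # builds the HTTP::Request, does not send it
    print $req->content, "\n";        # form data encoded with $encoding
}
```

With WWW::Mechanize the same call should work on the form object it hands back, for example `$mech->current_form->accept_charset($encoding)`, though I haven't checked that against every Mechanize version.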