I have resolved the issue. The answer was already supplied in an earlier posting from ikegami. I did not understand how to call the HTML::Form method accept_charset

Below is the modified code that now handles accented input correctly

Really appreciate all the wisdom lurking in this forum

use strict; use warnings; use WWW::Mechanize; # Create a new browser my $browser = WWW::Mechanize->new(autocheck => 1 ); # Tell it to get the main page $browser->get("http://193.1.97.44/focloir/"); # Okay, fill in the form with the first word to look up $browser->form_number(1); # Select first as active form # # This is the patch to specify input character set $browser->form_number(2)->accept_charset ("iso-8859-15"); $browser->field("WORD", "acht"); # Next word in dict is achtú # Get a consecutive sequence of words, one word per web request for (my $i=1; $i<=2; $i++) { $browser->dump_forms(); # i=1 WORD parameter is acht # Hex: [61 63 68 74] # i=2 WORD parameter is achtú # Hex: [61 63 68 74 FA] $browser->click(); # Make the Web request print $browser->content; # i=1 Word found # i=2 Message: could not find - # acht&#195;&#186; # Hex: [61 63 68 74 c3 ba] # which is achtú in UTF-8 sleep (1); # Just in case we get into # trouble with the web server # # Pick the second form. It should have the next head word # already filled in # NOTE: application code does not access any parameters on # this form $browser->form_number(2); # Select second form as # active form # # This is the patch to specify input character set $browser->form_number(2)->accept_charset ("iso-8859-15"); + }

In reply to Re^3: www:mechanize mangles unicode by Desmond Walsh
in thread www:mechanize mangles unicode by red0hat

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.