in reply to WWW::Mechanize & encoding

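If you mean entities like &#12443; and &#x309B; (both stand for ゛, U+309B), those are not UTF-8 specific. They are numeric character references that give you the Unicode code point, regardless of the page's encoding.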

You can convert them to "normal" characters (rather than HTML entities) with HTML::Entities. You can then encode the resulting string with Encode::encode in any encoding you like, as long as it supports those characters.
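
A minimal sketch of those two steps, assuming the input is a plain string containing the decimal and hexadecimal references for U+309B. The sample string and variable names are made up for illustration. ISO-8859-1 is used only to show what happens when the target encoding cannot represent the character.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);
use Encode qw(encode);

# Hypothetical sample input: decimal and hex references to U+309B
my $html = 'decimal: &#12443; hex: &#x309B;';

# Step 1: turn the entities into real characters.
# In non-void context decode_entities returns the decoded string
# and leaves $html untouched.
my $text = decode_entities($html);

# Step 2: encode the character string into bytes.
my $utf8 = encode('UTF-8', $text);

# An encoding that lacks the character: with FB_CROAK, encode dies
# instead of substituting a replacement. A copy is passed because
# encode may modify its source argument when a CHECK value is given.
my $copy   = $text;
my $latin1 = eval { encode('ISO-8859-1', $copy, Encode::FB_CROAK) };
warn "ISO-8859-1 cannot represent this string\n" unless defined $latin1;
```

Without a CHECK argument, encode substitutes a replacement character for anything the target encoding cannot represent, so pass Encode::FB_CROAK if you would rather get an error.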