Assuming the web server on the other end is properly configured, it will state the document's encoding in the HTTP response headers. This typically looks like Content-Type: text/html; charset=utf-8 or some variation thereof. If the charset value is specified, you can pass it to Encode and decode the response body with the proper encoding.
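
Here is a minimal sketch of that idea, using only the core Encode module. The helper names charset_from_content_type and decode_body are made up for illustration, and the UTF-8 fallback is an assumption, not something the server guarantees:

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: pull the charset parameter out of a Content-Type value.
# Returns undef when no charset is given.
sub charset_from_content_type {
    my ($content_type) = @_;
    return unless defined $content_type;
    return $content_type =~ /;\s*charset\s*=\s*"?([^\s;"]+)"?/i ? $1 : undef;
}

# Hypothetical helper: decode raw bytes into Perl characters, using the
# charset from the header, then an optional fallback, then UTF-8 (assumed).
sub decode_body {
    my ($bytes, $content_type, $fallback) = @_;
    my $charset = charset_from_content_type($content_type) || $fallback || 'UTF-8';
    my $enc = find_encoding($charset)
        or die "Unknown encoding: $charset\n";
    return $enc->decode($bytes);
}

# "caf\xC3\xA9" is the UTF-8 byte sequence for "café".
my $text = decode_body("caf\xC3\xA9", 'text/html; charset=utf-8');
print length($text), "\n";   # 4 characters, not 5 bytes
```

With LWP, the header value would typically come from $response->header('Content-Type'). HTTP::Response also offers a decoded_content method that does this kind of charset handling for you, which may be all you need.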

Alternatively, you can sometimes find this information in the <head> section of the HTML document. In that case, look for a line like this: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />.
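
A rough sketch of that lookup follows. The helper name charset_from_meta and the 4096-byte scan window are assumptions for illustration. A regex like this only works when the encoding is ASCII-compatible, and a real HTML parser would be more robust:

```perl
use strict;
use warnings;

# Hypothetical helper: look for a charset declared in a <meta> tag near the
# top of the still-undecoded document. Returns undef if none is found.
sub charset_from_meta {
    my ($bytes) = @_;
    my $head = substr($bytes, 0, 4096);   # only scan the start of the page
    if ($head =~ /<meta\b[^>]*\bcharset\s*=\s*["']?([A-Za-z0-9._:-]+)/i) {
        return $1;
    }
    return;
}

my $html = '<head><meta http-equiv="Content-Type" '
         . 'content="text/html; charset=utf-8" /></head>';
print charset_from_meta($html), "\n";   # prints "utf-8"
```

The value it returns can then be handed to Encode in the same way as a charset taken from the HTTP headers.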

See Encode for details on converting between character sets. See HTTP::Response and HTTP::Headers for access to the HTTP headers.


In reply to Re: Encoding Hell by rhesa
in thread Encoding Hell by kettle
