in reply to Re^4: How to determine HTML encoding
in thread How to determine HTML encoding

How do you expect to construct an HTTP response from just the body of the response? You need the headers too. By creating an H::R response object, you don't magically get the headers your function previously discarded.

I mentioned that an HTTP response is a possible source of the encoding. I used an example where the HTTP response is provided via an HTTP::Response object. The H::R object reflects the HTTP response; it's not a decoding tool.

You're not starting from an HTTP response. You're starting from an HTML file. I explained how to handle that situation too.

Either fix your function to return an HTTP response and not just the body of an HTTP response, or treat the value returned as a file.
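
To illustrate the first option, here is a minimal sketch, not a definitive implementation. The function name fetch_page is hypothetical, and the response is built by hand (with a made-up header and body) so the example runs without a network; in real code it would come from LWP::UserAgent->new->get($url). The point is that the function hands back the whole HTTP::Response object, so the headers are still around when ->decoded_content needs them.

```perl
use strict;
use warnings;
use HTTP::Response;
use Encode qw(encode);

# Stand-in for the object LWP::UserAgent->new->get($url) would return.
# The header and body below are invented for illustration only.
my $canned = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=UTF-8' ],
    encode('UTF-8', "<html><body>caf\x{E9}</body></html>"),
);

# Hypothetical fetch function: return the response itself,
# not just $response->content, so the headers are not discarded.
sub fetch_page {
    return $canned;
}

my $res = fetch_page();

# Raw bytes exactly as they arrived in the body.
my $bytes = $res->content;

# Characters, decoded using the charset given in the Content-Type header.
my $html = $res->decoded_content;
```

If the function returned only $res->content, the charset from the Content-Type header would be lost and the caller would be back to treating the value as a file.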

Re^6: How to determine HTML encoding
by slugger415 (Monk) on Jul 01, 2010 at 16:48 UTC

    You can probably tell I don't understand HTTP responses very well; I was considering this comment of yours:

    That said, you don't need to do ANY of this. You just use HTTP::Response's ->decoded_content method and it will decode the content for you.

    I was trying to figure out how to do that, but wasn't able to figure it out from reading the documentation. In any event, back to Plan A:

    print join " ", LWP::UserAgent->new->get("$url")->content_type, $/;

    Oddly enough that works, but only returns this:

    content type: text/html

    The charset is not included, although it is when I use the same call on the MS or Google sites from your example.

    Confused? yes. :-(

    Thanks for your patience, and apologies for not getting it.

      The web server doesn't necessarily know the encoding. You'll have to peek inside the file for the meta tag.
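
Peeking inside the file could look roughly like the sketch below. The helper name sniff_meta_charset is hypothetical, and it relies on a regex heuristic over the first 1024 bytes rather than a real HTML parser, so treat it as a starting point under those assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: look for a charset declaration near the top of an
# HTML document. Handles both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
sub sniff_meta_charset {
    my ($bytes) = @_;

    # Only inspect the start of the document; the limit is an arbitrary choice.
    my $head = substr($bytes, 0, 1024);

    if ($head =~ /<meta\b[^>]*?\bcharset\s*=\s*["']?\s*([\w.:-]+)/i) {
        return $1;
    }
    return;    # nothing declared
}

my $page = '<html><head><meta http-equiv="Content-Type" '
         . 'content="text/html; charset=windows-1252"></head></html>';

my $charset = sniff_meta_charset($page);
print defined $charset ? "charset: $charset\n" : "no charset declared\n";
```

When nothing is found, the caller still has to choose a fallback encoding.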