From reading the documentation, it appears that LWP will indeed return a proper Perl string with correctly decoded text if the correct Content-Type header is present in the response. If the pages you are downloading are not in UTF-8 and do not carry a Content-Type header specifying their encoding, then $response->decoded_content(default_charset => 'utf-8') will not work, and you will need some method of guessing the correct encoding. But if those pages do have proper headers, and/or you can otherwise assume they are in UTF-8, then yes, that should work.
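To make the two cases concrete, here is a small offline sketch. It assumes the HTTP::Response class that ships alongside LWP is installed, and it builds the responses by hand instead of fetching anything; the sample bodies and variable names are made up for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;

# Case 1: the header names the charset, so default_charset is not consulted.
# Hypothetical body: "café" stored as ISO-8859-1 bytes.
my $with_charset = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=ISO-8859-1' ],
    "caf\xE9",
);
my $text_from_header = $with_charset->decoded_content(default_charset => 'utf-8');

# Case 2: no charset in the header, so default_charset is used as the fallback.
# Hypothetical body: "café" stored as UTF-8 bytes.
my $without_charset = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html' ],
    "caf\xC3\xA9",
);
my $text_from_default = $without_charset->decoded_content(default_charset => 'utf-8');

# Both should now be the same four-character Perl string.
print "same text\n" if $text_from_header eq $text_from_default;
```

If the second body were really in some other encoding, the fallback would decode it incorrectly, which is the situation where guessing the encoding becomes necessary.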
There is a one-line "any-encoding-to-UTF-8" conversion in Perl, but it requires you to know which encoding you are starting with. The function to use is decode from the Encode module, which turns the encoded bytes into a Perl string.
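A minimal sketch of that conversion, assuming the source encoding is already known (ISO-8859-1 here, chosen only for illustration, as are the variable names):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: we already know the input encoding.
my $source_encoding = 'ISO-8859-1';

# Hypothetical input: the bytes of "café" as stored in ISO-8859-1.
my $bytes = "caf\xE9";

# decode() turns encoded bytes into a Perl character string.
my $text = decode($source_encoding, $bytes);

# encode() turns the character string back into bytes, this time UTF-8.
my $utf8_bytes = encode('UTF-8', $text);

printf "%d characters, %d UTF-8 bytes\n", length($text), length($utf8_bytes);
```

Encode also exports from_to($bytes, 'ISO-8859-1', 'UTF-8') on request, which does the same conversion in place on the byte string, but it likewise needs the source encoding to be named.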
In reply to Re: How do I convert any given html to utf-8?
by Errto
in thread How do I convert any given html to utf-8?
by isync