in reply to Re: ':encoding(UTF-8)' corrupts strings from XML::LibXML which doesn't return unicode strings ?

It may be because the Microsoft website isn't indicating the document's UTF-8-ness in the HTTP headers

Hmm, I got fooled by Firefox; it reported UTF-8 :)

Adding encoding => 'UTF-8' to the load_html call also works.
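
A minimal sketch of that call, assuming the HTML is already in memory as raw bytes. The sample markup and the variable names are made up for illustration; only load_html, its string and encoding arguments, and findvalue come from XML::LibXML itself.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input: raw UTF-8 bytes, with no <meta charset>
# and no HTTP header to tell the parser what the encoding is.
my $html = "<html><body><p>caf\xC3\xA9</p></body></html>";

# Pass the encoding explicitly instead of relying on detection.
my $doc = XML::LibXML->load_html(
    string   => $html,
    encoding => 'UTF-8',
);

# Values returned by XML::LibXML are Perl character strings, so
# encode them once on output.
my $text = $doc->findvalue('//p');
binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

The same encoding argument can be combined with location => $path when the document lives in a file rather than in a string.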

On a related note, the encoding option doesn't work with parse_html_file or new, but load_html's location argument will gladly accept filenames and file paths.
