You can use HTML::TreeBuilder to parse the HTML, then output it as XHTML with the as_XML method, which works most of the time. It may not help with the encoding problem, though, especially if the HTML lies about its encoding. XML::Twig can do this for you, BTW, so you may not need tidy at all: just install HTML::TreeBuilder, then use parsefile_html to parse the HTML.
Also, HTML::Tidy uses a fork of tidy and may be worth a try.
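As a rough sketch of the two routes mentioned above (not tested against the original poster's data): the sample HTML string and the helper names html_to_xml and html_to_xml_twig are made up for illustration. The second helper uses parse_html, the string counterpart of parsefile_html, and needs both XML::Twig and HTML::TreeBuilder installed. Neither helper does anything about a wrongly declared encoding.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: parse an HTML string and serialize the tree with as_XML.
sub html_to_xml {
    my ($html) = @_;
    my $tree = HTML::TreeBuilder->new;
    $tree->parse_content($html);    # parse the string and signal end of input
    my $xml = $tree->as_XML;        # XML-style serialization of the tree
    $tree->delete;                  # release the tree explicitly
    return $xml;
}

# Hypothetical helper: same idea through XML::Twig, which hands the
# HTML to HTML::TreeBuilder internally. Loaded lazily so the first
# helper still works when XML::Twig is not installed.
sub html_to_xml_twig {
    my ($html) = @_;
    require XML::Twig;
    my $twig = XML::Twig->new;
    $twig->parse_html($html);       # use parsefile_html($file) for a file
    return $twig->sprint;
}

# Made-up input with an unclosed <p> and a bare <br>.
print html_to_xml('<p>one<br>two'), "\n";
```

If the input is a file rather than a string, HTML::TreeBuilder's parse_file (or XML::Twig's parsefile_html) would be the call to reach for instead.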
In reply to Re^3: Encoding/decoding question by mirod, in thread Encoding/decoding question by slugger415