I've become a pretty big fan of XML::LibXML, but I haven't tried it on content
that isn't well-formed XML before. With the recover option, how much of the
usual messiness in typical HTML can it handle (missing end tags, unquoted
attribute values, unclosed singleton tags like <br>)?