in reply to Parsing badly formed HTML

If memory serves, the ability to handle poorly formed markup is one of the features of HTML::Parser and its children, courtesy of Gisle Aas. I have used this family on a few occasions to extract information from some pretty ghastly markup and have never had any problems.
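
For what it's worth, here is a minimal sketch of the sort of thing I mean, assuming HTML::Parser is installed from CPAN and using its version 3 handler API. The sample markup and the variable names are made up for illustration; the point is only that unclosed tags, an unquoted attribute and a stray end tag do not stop the parser from reporting events.

```perl
use strict;
use warnings;
use HTML::Parser;

# Deliberately sloppy sample markup (invented for this example):
# unclosed <p> and <b>, an unquoted attribute value, a stray </i>.
my $html = '<p>First <b>bold<p>Second <a href=foo.html>link</i> tail';

my @links;
my $text = '';

my $parser = HTML::Parser->new(
    api_version => 3,
    # Called for every start tag; collect href values from <a> tags.
    start_h => [
        sub {
            my ($tag, $attr) = @_;
            push @links, $attr->{href}
                if $tag eq 'a' && defined $attr->{href};
        },
        'tagname, attr',
    ],
    # Called for every run of text; 'dtext' gives entity-decoded text.
    text_h => [ sub { $text .= shift }, 'dtext' ],
);

$parser->parse($html);
$parser->eof;    # flush any text still buffered

print "Links: @links\n";
print "Text:  $text\n";
```

HTML::Parser is event-driven and does not try to build or validate a document tree, which is probably why it shrugs off markup like this; if you need a tree, that is a job for one of the modules built on top of it.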

I never even bother trying to use XML::Parser unless I know that the markup is going to be well-formed. (I lie - sometimes I actually use XML::Parser just to see if the markup is well-formed.)
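
The well-formedness check can be sketched roughly like this, assuming XML::Parser is installed. It relies on the fact that parse() dies on the first well-formedness error, so wrapping the call in eval is enough. The helper name is_well_formed and the sample strings are hypothetical, not part of the module.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: true if $markup parses as well-formed XML.
# XML::Parser's parse() dies on the first well-formedness error,
# so we trap that with eval and turn it into a boolean.
sub is_well_formed {
    my ($markup) = @_;
    my $parser = XML::Parser->new(ErrorContext => 2);
    my $ok = eval { $parser->parse($markup); 1 };
    return $ok ? 1 : 0;
}

# Invented samples: the second one closes <p> before <b>.
for my $sample ('<p>fine <b>bold</b></p>', '<p>broken <b>bold</p>') {
    printf "%-26s %s\n", $sample,
        is_well_formed($sample) ? 'well-formed' : 'NOT well-formed';
}
```

If you want to know what was wrong rather than just whether something was, the error message is left in $@ after the eval fails.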