You could also give HTML::TokeParser (or HTML::TokeParser::Simple) a try.
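Something along these lines might be a starting point. It is only a rough sketch: the sample markup, the example.com URLs and the `extract_links` helper are made up for illustration, and pulling out links is just one guess at what you need. It reads the text up to the next tag instead of waiting for a closing `</a>`, since sloppy markup often leaves that out.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Deliberately sloppy sample markup (hypothetical): unclosed <p> and <li>,
# an unquoted attribute value, and a missing </a>.
my $html = <<'HTML';
<p>Some links:
<ul>
<li><a href=http://example.com/one>First
<li><a href="http://example.com/two">Second</a>
</ul>
HTML

# Hypothetical helper: collect [href, text] pairs for every <a> start tag.
sub extract_links {
    my ($doc) = @_;

    # Passing a scalar reference makes the parser read from the string.
    my $p = HTML::TokeParser->new(\$doc)
        or die "Cannot create parser";

    my @links;
    while (my $tag = $p->get_tag('a')) {
        # For a start tag, $tag is [$tagname, \%attr, \@attrseq, $text].
        my $href = $tag->[1]{href};
        next unless defined $href;

        # No end tag given: take the text up to the next tag of any kind,
        # so a missing </a> does not swallow the following link.
        my $text = $p->get_trimmed_text;
        push @links, [$href, $text];
    }
    return @links;
}

printf "%s => %s\n", @$_ for extract_links($html);
```

HTML::TokeParser::Simple works the same way but hands back token objects, so you can call methods such as `is_start_tag` and `get_attr` instead of indexing into array references.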
In reply to Re: Parsing badly formed HTML by almut in thread Parsing badly formed HTML by SilasTheMonk