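I would suggest you take a look at HTML::TreeBuilder. It does a fine job of repairing broken HTML, and it is quite easy to remove tags that you do not want to allow: locate them with find(), then call delete() on each element (or replace_with_content() if you want to keep the text inside).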
If the HTML is just a snippet (not a complete document with html and body tags), the parser will add the missing html, head, and body elements implicitly, but one can always use disembowel() to get rid of those and get back just the content you fed in. :-)
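
Here is a rough sketch of how that could look. The sample snippet, the choice of which tags to drop (script, iframe) versus unwrap (font), and the variable names are only assumptions for illustration, so adjust them to your own whitelist. Treat it as a starting point rather than a complete sanitizer: it does nothing about dangerous attributes such as onclick, for instance.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# Hypothetical input: a snippet with no html/body wrapper.
my $snippet = '<p>Hello <b>world</b><script>alert(1)</script> '
            . '<font color="red">again</font></p>';

my $tree = HTML::TreeBuilder->new;
$tree->parse($snippet);
$tree->eof;    # tell the parser the input is complete

# Remove these elements together with everything inside them.
$_->delete for $tree->find('script', 'iframe');

# Remove only the tag itself and keep its children in place.
$_->replace_with_content->delete for $tree->find('font');

# disembowel() hands back the nodes that were really in the input,
# without the implicit html/head/body wrappers. In list context,
# top-level text comes back as plain strings, so encode it by hand.
my @nodes = $tree->disembowel;
my $clean = join '', map {
    ref $_ ? $_->as_HTML('<>&', undef, {})    # {} = emit every end tag
           : encode_entities($_, '<>&')
} @nodes;

print $clean, "\n";
```

Note that disembowel() is destructive, so do not use $tree again afterwards. If you still need the tree, guts() returns the same nodes without tearing it apart.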