in reply to Dynamically cleaning up HTML fragments

I have not used it extensively, but another module that looks really neat for parsing and "tidying" HTML is Marpa-HTML. Its html_fmt demo handles missing start and end tags, and the distribution's documentation talks about being able to selectively eliminate certain types of tags.