in reply to HTML Matching

The FAQ answer (How do I remove HTML from a string?) suggests that HTML::Parse is the most correct answer.

It also notes that HTML comments, tags that continue over line breaks, and angle brackets within quoted attributes can break a simpler parser. For example:

<!--- <img src="foo.jpg" alt="proof that 4 > 2"> -->

If you're dealing with machine-generated HTML and can guarantee a certain degree of cleanliness, your solution will work. Otherwise, you really need a parser. The trade-off is that a parser will be slower, since it has to keep track of open brackets and quotes.
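
To make the failure concrete, here is a rough sketch in core Perl. It runs the usual one-line substitution over the sample, then a slightly more careful pattern that knows about comments and quoted attributes. The name strip_tags is made up for illustration, and the second pattern is still only a regex, not a parser: it assumes well-formed comments and balanced quotes, and it knows nothing about script blocks or CDATA.

```perl
use strict;
use warnings;

# Sample input: a comment wrapping a tag whose alt text contains '>'.
my $html = '<!--- <img src="foo.jpg" alt="proof that 4 > 2"> -->';

# Naive approach: delete everything from a '<' up to the next '>'.
# The match ends at the '>' inside the quoted attribute.
(my $naive = $html) =~ s/<[^>]*>//g;
print "naive:   [$naive]\n";    # prints: naive:   [ 2"> -->]

# Hypothetical helper (not from the FAQ): treat a comment as one unit and
# let quoted attribute values contain '>' without ending the tag.
sub strip_tags {
    my ($s) = @_;
    $s =~ s{
        <!--.*?-->                                  # a whole comment
      | < (?: [^>"']+ | "[^"]*" | '[^']*' )* >      # a tag, quotes respected
    }{}gsx;
    return $s;
}

print "careful: [", strip_tags($html), "]\n";       # prints: careful: []
print strip_tags('<b title="a > b">bold</b> text'), "\n";   # prints: bold text
```

The second pattern is already doing a parser's bookkeeping (which bracket is open, whether we are inside quotes), which is where the extra cost comes from. For HTML you don't control, a real parsing module is still the safer choice.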

It's like the old saying, "Only perl can parse Perl."