Your point is taken, though: I didn't say why I still don't think that the amended solution is robust. Parsing HTML with a series of regexes is slow and difficult. script tags don't necessarily have end tags, for example: they could simply have a link to a .js file. Then, much later in the HTML document, if there was a closing script tag for another block, the regex would match from the first opening tag to that closing tag, swallowing and deleting the valid content in between.
For performance, HTML::Stripper is an XS module, so it would be much, much faster than the multi-pass regex approach.
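To make the swallowing problem concrete, here is a minimal sketch. The sample markup, the variable names, and the two substitutions are my own stand-ins for a typical multi-pass regex approach, not the actual code posted earlier in the thread. It assumes a script tag written in the self-closed style with no separate end tag.

```perl
use strict;
use warnings;

# Hypothetical input: an external script reference with no separate end tag,
# followed later by an ordinary inline script block.
my $html = join '',
    '<script src="menu.js" />',
    '<p>Valid content that should survive.</p>',
    '<script>var x = 1;</script>',
    '<p>Trailing text.</p>';

# Pass 1 (naive): delete from an opening script tag to the next closing one.
# The non-greedy match still has to run to the first closing tag it finds,
# which here belongs to the second block.
(my $stripped = $html) =~ s{<script\b.*?</script>}{}gis;

# Pass 2 (naive): drop whatever tags remain.
$stripped =~ s{<[^>]*>}{}g;

print "$stripped\n";    # prints "Trailing text."
```

The first paragraph is gone, because the match started at the script tag that had no end tag of its own and ran to the closing tag of the later block. A real parser tracks element boundaries instead of pairing up whatever tags happen to be nearest.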
In reply to Re^4: Stripping HTML tags
by fishbot_v2
in thread Stripping HTML tags
by Anonymous Monk