in reply to Stripping HTML tags with Regular Expressions.

Regexps seem like the way to go at first, until a day later you're still handling all the complex cases you forgot about (a ">" inside a quoted attribute value, comments, script blocks, and so on).

It's better to use something like HTML::Parser, which does it right. HTML::Parser used to ship with an example that stripped all tags.
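I don't have that bundled example in front of me, so this is not it, just a minimal sketch of the idea, assuming the version 3 API of HTML::Parser. The strip_tags name is made up for illustration, and skipping the contents of script and style elements is my own choice, not something the module requires.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: return only the text content of an HTML string.
sub strip_tags {
    my ($html) = @_;
    my $text = '';

    my $p = HTML::Parser->new(
        api_version => 3,
        # Collect text chunks only; 'dtext' hands us the text with
        # entities such as &amp; already decoded.
        text_h => [ sub { $text .= shift }, 'dtext' ],
    );

    # Assumption: we don't want JavaScript or CSS in the output.
    $p->ignore_elements(qw(script style));

    $p->parse($html);
    $p->eof;    # signal end of document so buffered text is flushed

    return $text;
}

print strip_tags('<p>Hello <b>world</b> &amp; friends</p>'), "\n";
# prints "Hello world & friends"
```

Since no start or end handlers are registered, the tags are simply dropped, and the parser (not your regexp) gets to worry about where each tag really ends.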

I do use regexps for cases that are predictable, e.g. when we have some text that's known to contain only specific markup that's easily handled with regexps. Not for general HTML tag stripping, though.
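To give an idea of what I mean by predictable, here is a rough sketch. It assumes the text only ever contains bare b, i, and br tags with no attributes; that tag set and the strip_known_markup name are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper. Assumption: the input contains only bare
# <b>, <i>, and <br> tags (no attributes, no other elements).
# Anything else is deliberately left untouched.
sub strip_known_markup {
    my ($text) = @_;
    $text =~ s{<br\s*/?>}{\n}gi;    # turn line breaks into newlines
    $text =~ s{</?(?:b|i)>}{}gi;    # drop bold and italic tags
    return $text;
}

print strip_known_markup('A <b>bold</b> and <i>italic</i> line<br>next'), "\n";
# prints "A bold and italic line", then "next" on a second line
```

The moment someone feeds this anything outside that known set, it's back to a real parser.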
