in reply to Regex: Strip <script> tags?
However, as skx has already pointed out, evil is not restricted to items labeled "<script...>".
Bottom line: You should probably study security issues (suggestion: start with some examples of why to use -t, then move on to more general considerations) AND improve your regex-fu before borrowing code.
You've been here long enough to have seen discussions of the un-wisdom of writing your own HTML parsers, and might wish to review some of those (Cliff notes-style summary: you might screw up by rolling your own). Also read these old-but-still-good nodes: Re: How to remove HTML tags from text (by skx, with a more expansive version of his comment above); How do I test for potential security problems?; and Re: Remove HTML tags from document, including Jured's links to asking questions.
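
To make the point about evil not being restricted to "<script...>" concrete, here is a minimal sketch. The sub strip_script_tags and the sample strings are hypothetical, made up for illustration; this is not the code from the original question, just one plausible naive filter of the kind such a regex usually turns into.

```perl
use strict;
use warnings;

# Hypothetical naive filter: removes <script>...</script> blocks and nothing else.
sub strip_script_tags {
    my ($html) = @_;
    # /g all matches, /i case-insensitive, /s lets . match newlines
    $html =~ s{<script\b[^>]*>.*?</script\s*>}{}gis;
    return $html;
}

# Illustrative inputs (assumptions, not taken from the thread).
my @samples = (
    '<script>alert(1)</script><p>hello</p>',       # the easy case
    '<img src="x" onerror="alert(1)">',            # event-handler attribute
    '<a href="javascript:alert(1)">click</a>',     # javascript: URL
    '<scr<script></script>ipt>alert(1)</script>',  # nested tag trick
);

for my $in (@samples) {
    printf "%-45s => %s\n", $in, strip_script_tags($in);
}
```

Only the first sample comes out clean. The second and third pass through untouched because they never contain a script tag, and in the fourth the single substitution pass removes the inner pair and leaves a freshly assembled script element behind. That is the sort of thing a real parser plus a whitelist of allowed tags and attributes is meant to handle.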