Re: Regex: Strip <script> tags?

If your only worry were attributes following the start of the tag, as in, for example,
<script src=....
you could simply remove the ">" at the end of the first "<script>" in the (cargo-culted) regex, thusly:
<script.*?<\/script>
which will catch anything inside script tags (unless, illogically, they're miswritten by your users with nested <script ...> tags). Note that "." must be allowed to match newlines (the /s modifier in Perl) or multi-line scripts will slip through. (Update: In fact, this is a faq.)
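For anyone who wants to poke at this, here is a minimal sketch of how the loosened pattern behaves. It is written in Python rather than Perl purely for illustration (the pattern itself is the same); the name strip_script is made up for the example, and the DOTALL and IGNORECASE flags are my additions, standing in for Perl's /s and /i.

```python
import re

# Loosened pattern: no ">" right after "<script", so attributes are allowed.
# DOTALL lets "." cross newlines; IGNORECASE also catches <SCRIPT>.
SCRIPT_RE = re.compile(r"<script.*?</script>", re.DOTALL | re.IGNORECASE)


def strip_script(html: str) -> str:
    """Remove every non-overlapping <script ...>...</script> run (hypothetical helper)."""
    return SCRIPT_RE.sub("", html)


# Attributes after the tag name are now covered.
print(strip_script('a<script src="x.js"></script>b'))          # ab

# Multi-line bodies are covered only because of DOTALL.
print(strip_script('a<script>\nalert(1)\n</script>b'))         # ab

# Nested tags: the non-greedy match stops at the FIRST </script>,
# so the tail of the outer element is left behind.
print(strip_script('a<script><script>x</script>y</script>b'))  # ay</script>b
```

The last line is the nested-tag case mentioned above: the match ends at the first closing tag, and the leftovers stay in the output.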

However, as skx has already pointed out, evil is not restricted to items labeled "<script...>".
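To make that concrete, here is a small sketch (again Python for illustration only, with made-up sample strings and a hypothetical strip_script helper) of input that a script-tag regex does not neutralize. These are just three well-known patterns, not an exhaustive list.

```python
import re

SCRIPT_RE = re.compile(r"<script.*?</script>", re.DOTALL | re.IGNORECASE)


def strip_script(html: str) -> str:
    """Single-pass removal of <script ...>...</script> runs (hypothetical helper)."""
    return SCRIPT_RE.sub("", html)


samples = [
    # Event-handler attribute: no <script> tag anywhere.
    '<img src="x" onerror="alert(1)">',
    # javascript: URL in an ordinary link.
    '<a href="javascript:alert(1)">click</a>',
    # Removing the inner empty element reassembles a working <script> tag.
    '<scr<script></script>ipt>alert(1)</script>',
]

cleaned = [strip_script(s) for s in samples]
for before, after in zip(samples, cleaned):
    print(f"{before!r} -> {after!r}")

# The first two pass through untouched; the third becomes
# '<script>alert(1)</script>' after the single substitution pass.
```

The third sample shows why "strip and move on" is shaky: the output of the filter is itself something the filter was supposed to remove.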

Bottom line: You should probably consider/study security issues (suggestion: start with some examples of why to use -t and move on to more generic considerations) AND should improve your regex-fu before borrowing code.

You've been here long enough to have seen discussions of the un-wisdom of writing your own HTML parsers, and might wish to review some of those (CliffsNotes-style summary: you might screw up by rolling your own) and also read these old-but-still-good nodes: Re: How to remove HTML tags from text (by skx, with a more expansive version of his comment above); How do I test for potential security problems?; and Re: Remove HTML tags from document, including Jured's links to asking questions.
