I am removing HTML tags by hand . . . though it's 90% a set of really neat regexes that I found online (and yes, it does work with comments, etc.). The only problem is that it leaves behind the non-visible text that sits between the opening and closing tags of certain elements (such as STYLE, OPTION, SCRIPT, and TEXTAREA). I made the following regex:
But it doesn't work; I think it's because a script can contain a < before the element has actually ended. The problem is in the middle:
. . . I'm sure you know what I meant. But how do I do this the right way in a regex? (That is, something to the extent of "continue until you see a </\1".) Thanks!
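
For what it's worth, the "continue until you see a </\1" idea can usually be written as a lazy match followed by a backreference to the captured tag name. Below is a minimal sketch in Python, not the regex from the post above: the names HIDDEN_ELEMENT and strip_hidden_elements are made up for illustration, and the list of tag names is just the four mentioned in the question.

```python
import re

# Hypothetical pattern for illustration; not the asker's original regex.
HIDDEN_ELEMENT = re.compile(
    r"<(script|style|option|textarea)\b[^>]*>"  # opening tag; the tag name is group 1
    r".*?"                                      # lazily consume anything, including '<'
    r"</\1\s*>",                                # stop at the first closing tag with the same name
    re.IGNORECASE | re.DOTALL,                  # DOTALL lets '.' cross line breaks
)


def strip_hidden_elements(html: str) -> str:
    """Remove the listed elements together with everything between their tags."""
    return HIDDEN_ELEMENT.sub("", html)


if __name__ == "__main__":
    sample = "before<script>if (a < b) { run(); }</script>after"
    print(strip_hidden_elements(sample))
```

The lazy `.*?` is what lets a stray < inside a script pass through: the match only ends at a closing tag whose name equals group 1. This is still a sketch with known limits. It assumes no attribute value contains a >, and it will stop early if the script body itself contains the literal text of its own closing tag. A real HTML parser is likely the safer choice if the input is not under your control.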