That's a good point. My little regexes there don't convert every single entity, but they do strip EVERY tag and convert the <'s, >'s, quotes, and ampersands. Not much else would be left behind, honestly.
Regardless, bradcathey seems to have a very nice solution, which is much faster than a regex anyway.
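For anyone skimming the thread, here is a rough sketch of the regex idea described above: strip every tag, then convert only the four entities mentioned. It is written in Python rather than Perl, and the names `TAG_RE`, `ENTITIES`, and `strip_html` are made up for illustration; it is not the code from the earlier node and not bradcathey's solution.

```python
import re

# Hypothetical stand-in for the regexes discussed above, not the original code.
# Naive on purpose: a ">" inside an attribute value will end the match early.
TAG_RE = re.compile(r"<[^>]*>")

# Only the handful of entities mentioned in the post; anything else is left alone.
ENTITIES = [
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&quot;", '"'),
    ("&amp;", "&"),  # decoded last so "&amp;lt;" becomes "&lt;", not "<"
]


def strip_html(html):
    """Remove every tag, then decode the few entities listed in ENTITIES."""
    text = TAG_RE.sub("", html)
    for entity, char in ENTITIES:
        text = text.replace(entity, char)
    return text


print(strip_html('<p>Tom &amp; Jerry &lt;3 &quot;cheese&quot;</p>'))
```

Tags are removed before entities are decoded so that an escaped `&lt;b&gt;` in the text survives as literal `<b>` instead of being stripped. Entities outside the list, such as `&eacute;`, pass through unchanged, which is the gap conceded above.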
In reply to Re^3: Getting the text of the html document
by dyer85
in thread Getting the text of the html page
by agynr