in reply to Re^2: Getting the text of the html document
in thread Getting the text of the html page
That's a good point. My little regexes there don't convert every single entity, but they do strip EVERY tag and convert the <'s, >'s, quotes, and ampersands. Honestly, not much else would be left behind.
Regardless, bradcathey seems to have a very nice solution, which is much faster than the regexes anyway.
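
For anyone skimming the thread, here is a minimal sketch of the kind of regex pass described above. The sub name html_to_text and the sample string are made up for illustration, and this is not necessarily the exact code posted earlier. It assumes reasonably clean markup: a ">" inside an attribute value, a comment, or a script block will confuse it.

```perl
use strict;
use warnings;

# Hypothetical helper (name is illustrative): crude HTML-to-text
# conversion using regexes only.
sub html_to_text {
    my ($html) = @_;

    # Strip every tag. [^>] also matches newlines, so tags that
    # span several lines are removed too.
    $html =~ s/<[^>]*>//g;

    # Convert only the handful of entities mentioned above.
    $html =~ s/&lt;/</g;
    $html =~ s/&gt;/>/g;
    $html =~ s/&quot;/"/g;

    # Decode &amp; last so that "&amp;lt;" becomes "&lt;" and not "<".
    $html =~ s/&amp;/&/g;

    return $html;
}

print html_to_text('<p>Fish &amp; chips &lt;b&gt; &quot;hot&quot;</p>'), "\n";
```

Any other entity (&nbsp;, &copy;, numeric references, and so on) passes through untouched, which is the gap being pointed out.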
Re^4: Getting the text of the html document
by davorg (Chancellor) on Jul 19, 2005 at 09:36 UTC