http://qs1969.pair.com?node_id=222958


in reply to Extract text from HTML

I used to like reinventing the wheel every time... I used to do something like this:
# THIS IS BAD
s/(\s|\&nbsp;)+/ /g;
s/<(BR|P)>/\n/ig;
s/<.+?>//g;
Now I just:
use HTML::TokeParser;
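The post stops at the `use` line, so here is a minimal sketch of how the module might be used for this task. It is not necessarily what the author had in mind. The helper name `html_to_text` and the choice to turn `<br>` and `<p>` into newlines (mirroring the regex version above) are assumptions for illustration. Only `HTML::TokeParser->new`, `get_token`, and `HTML::Entities::decode_entities` come from the modules themselves.

```perl
use strict;
use warnings;
use HTML::TokeParser;
use HTML::Entities qw(decode_entities);

# Hypothetical helper: walk the token stream and keep only the text.
sub html_to_text {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    my $text = '';
    my $skip = 0;    # true while inside <script> or <style>

    while (my $token = $parser->get_token) {
        my $type = $token->[0];
        if ($type eq 'S') {                      # start tag; name is lowercased
            my $tag = $token->[1];
            $skip = 1 if $tag eq 'script' || $tag eq 'style';
            $text .= "\n" if $tag eq 'br' || $tag eq 'p';
        }
        elsif ($type eq 'E') {                   # end tag
            my $tag = $token->[1];
            $skip = 0 if $tag eq 'script' || $tag eq 'style';
        }
        elsif ($type eq 'T' && !$skip) {         # text; entities still encoded
            $text .= decode_entities($token->[1]);
        }
    }

    $text =~ s/[ \t\xA0]+/ /g;    # collapse spaces, tabs and decoded &nbsp;
    $text =~ s/^\s+|\s+$//g;      # trim leading and trailing whitespace
    return $text;
}

print html_to_text('<p>Hello&nbsp;<b>world</b></p><p>Bye<br>now</p>'), "\n";
```

Unlike the regex approach, the parser copes with a `>` inside an attribute value and with tags that span several lines, and comments are skipped because they arrive as their own token type.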