http://qs1969.pair.com?node_id=222737


in reply to Extract text from HTML

Here's a nice little function that does it.

(FYI: the FAQ noted by Juerd is obsolete. Not only does it give no actual solution, it also recommends HTML::Parse, which is now deprecated.)
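use HTML::Parser;

# Return the text content of an HTML string, with entities decoded.
sub extract_html_text {
    my $html = shift;
    my $text = '';
    HTML::Parser->new(
        api_version => 3,
        # "dtext" passes each text segment to the handler with entities decoded
        text_h      => [ sub { $text .= "@_"; }, "dtext" ],
    )->parse( $html )->eof;
    $text
}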
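One caveat: HTML::Parser also hands the contents of script and style elements to the text handler, so the function above includes any CSS and JavaScript in its result. Here's a minimal sketch of a variant that skips them, assuming that is what you want. The name extract_visible_text and the sample markup are made up for illustration.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical variant of extract_html_text that leaves out
# the contents of <script> and <style> elements.
sub extract_visible_text {
    my ($html) = @_;
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # "dtext" delivers each text segment with entities decoded
        text_h      => [ sub { $text .= $_[0] }, 'dtext' ],
    );

    # Skip these elements and everything inside them
    $parser->ignore_elements(qw(script style));

    $parser->parse($html);
    $parser->eof;

    return $text;
}

# Sample input, invented for this example
my $html = '<html><head><style>p { color: red }</style></head>'
         . '<body><p>Hello, <b>world</b> &amp; friends</p></body></html>';

print extract_visible_text($html), "\n";
```

The text of adjacent elements is concatenated with no separator, so add your own whitespace in the handler if block-level elements should not run together.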

UPDATE: Here's another (imho, nicer) little function that does it:
use HTML::TreeBuilder;

sub extract_html_text {
    HTML::TreeBuilder->new_from_content($_[0])->as_text
}

jdporter
...because it's hard to be handsome and white.