Here's a nice little function that does it.
(FYI: the FAQ noted by Juerd is obsolete. Not only does it not give an actual solution, but it recommends HTML::Parse, which is now deprecated.)
use HTML::Parser;

sub extract_html_text
{
    my $html = shift;
    my $text = '';
    # Append each decoded text chunk ("dtext") to $text as the parser sees it.
    HTML::Parser->new(
        api_version => 3,
        text_h      => [ sub { $text .= "@_"; }, "dtext" ],
    )->parse( $html )->eof;
    $text
}
UPDATE: Here's another (imho, nicer) little function that does it:
use HTML::TreeBuilder;

sub extract_html_text
{
    HTML::TreeBuilder->new_from_content($_[0])->as_text
}
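One caveat with the TreeBuilder approach, as far as I know: unless weak references are enabled (an opt-in added in HTML::TreeBuilder 5), the tree contains circular references and isn't freed until you call delete on it. Here's a minimal sketch of a variant that cleans up after itself. The name extract_html_text_and_free is just illustrative, not from any module.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical variant of extract_html_text that frees the tree explicitly.
sub extract_html_text_and_free
{
    my $tree = HTML::TreeBuilder->new_from_content( $_[0] );
    my $text = $tree->as_text;
    $tree->delete;    # break circular references so the tree can be freed
    return $text;
}

print extract_html_text_and_free('<p>Hello <b>world</b></p>'), "\n";
```

This probably only matters if you call the function many times in a long-running process; for a one-off script the shorter version should be fine.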
jdporter