Re: Stripping of HTML content

Use HTML::TokeParser::Simple. Its docs cover exactly what you want; a short example:

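 use HTML::TokeParser::Simple;

 # Join the page's lines into one string and hand the parser a reference to it
 my $html = join '', @webpage_lines;
 my $p = HTML::TokeParser::Simple->new( \$html );

 while ( my $token = $p->get_token ) {
     # This prints all text in an HTML doc (i.e., it strips the HTML)
     next unless $token->is_text;
     # as_is is the current name for return_text, which newer releases deprecate
     print $token->as_is;
 }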

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.
