in reply to html::parse inner body html

I'm not exactly sure what you want to end up with, but this script captures all the raw text between the body tags:
use HTML::Parser;

# $content is assumed to already hold the HTML document as a string.
my $inner_body = '';
my $in_body    = 0;

my $Parser = HTML::Parser->new(
    api_version => 3,
    handlers    => [
        start => [ \&start_handler, "tagname" ],
        text  => [ \&text_handler,  "text" ],
        end   => [ \&end_handler,   "tagname" ],
    ],
);
$Parser->parse($content);
$Parser->eof();

print $inner_body;

# Start collecting once <body> opens.
sub start_handler {
    my $tagname = shift;
    return unless ( $tagname eq 'body' );
    $in_body = 1;
}

# Append text only while inside <body>.
sub text_handler {
    my $text = shift;
    return unless $in_body;
    $inner_body .= $text;
}

# Stop collecting at </body>.
sub end_handler {
    my $tagname = shift;
    return unless ( $tagname eq 'body' );
    $in_body = 0;
}

Re^2: html::parse inner body html
by SneakZa (Initiate) on May 29, 2013 at 22:53 UTC
    Hi, I was looking to get the raw text with all the HTML tags, excluding the body tags themselves. Is that possible?
      There's probably a simpler way to do that, but this will do what you ask:
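      use HTML::Parser;

      # $content is assumed to already hold the HTML document as a string.
      # Same state variables as in the first script.
      my $inner_body = '';
      my $in_body    = 0;

      my $Parser = HTML::Parser->new(
          api_version => 3,
          handlers    => [
              start => [ \&start_handler, 'tagname,text' ],
              text  => [ \&text_handler,  "text" ],
              end   => [ \&end_handler,   "tagname,text" ],
          ],
      );
      $Parser->parse($content);
      $Parser->eof();

      print $inner_body;

      # Skip the <body> tag itself; keep every other start tag verbatim.
      sub start_handler {
          my $tagname = shift;
          if ( $tagname eq 'body' ) {
              $in_body = 1;
              return;
          }
          return unless $in_body;
          my $text = shift;
          $inner_body .= $text;
      }

      # Append text only while inside <body>.
      sub text_handler {
          my $text = shift;
          return unless $in_body;
          $inner_body .= $text;
      }

      # Skip the </body> tag itself; keep every other end tag verbatim.
      sub end_handler {
          my $tagname = shift;
          if ( $tagname eq 'body' ) {
              $in_body = 0;
              return;
          }
          return unless $in_body;
          my $text = shift;
          $inner_body .= $text;
      }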