
Hi, I'm looking to get the raw content of the body, with all the HTML tags kept but excluding the body tags themselves. Is that possible?

Re^3: html::parse inner body html
by tangent (Parson) on May 30, 2013 at 00:17 UTC
    There's probably a simpler way to do that but this will do what you ask:
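    use HTML::Parser;

    # $content is expected to already hold the HTML document as a string.
    my $in_body    = 0;     # true while the parser is between <body> and </body>
    my $inner_body = '';    # accumulates the raw markup found inside the body

    my $Parser = HTML::Parser->new(
        api_version => 3,
        handlers    => [
            start => [\&start_handler, 'tagname,text'],
            text  => [\&text_handler,  'text'],
            end   => [\&end_handler,   'tagname,text'],
        ],
    );
    $Parser->parse($content);
    $Parser->eof();
    print $inner_body;

    # Opening tags: switch collection on at <body>, otherwise copy the tag as written.
    sub start_handler {
        my ($tagname, $text) = @_;
        if ( $tagname eq 'body' ) {
            $in_body = 1;
            return;
        }
        return unless $in_body;
        $inner_body .= $text;
    }

    # Plain text between tags: copy it only while inside the body.
    sub text_handler {
        my ($text) = @_;
        return unless $in_body;
        $inner_body .= $text;
    }

    # Closing tags: switch collection off at </body>, otherwise copy the tag as written.
    sub end_handler {
        my ($tagname, $text) = @_;
        if ( $tagname eq 'body' ) {
            $in_body = 0;
            return;
        }
        return unless $in_body;
        $inner_body .= $text;
    }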