in reply to is_start_tag not on tokeparser simple

The HTML::TokeParser::Simple documentation states that the ->is_start_tag method should be called on the token, not on the parser object:

if ( $token->is_start_tag( qr/^h[123456]$/ ) ) { ... }

So your code should be using $tkn where it uses $htm5.

Re^2: is_start_tag not on tokeparser simple
by Anonymous Monk on Mar 22, 2006 at 17:28 UTC
    Thanks!

    It executes without error now, but it doesn't find any heading tags. I left the other $htm5 references unchanged, because changing every instance to $tkn made it error out again.

    use warnings;
    use strict;
    use LWP::Simple;
    my $url = "http://www.w3schools.com/html/html_primary.asp"; # this has lots of <h#> tags
    my $src = get($url);
    my $headtags = '';
    use HTML::TokeParser::Simple;
    my $htm5 = HTML::TokeParser::Simple->new(\$src);
    while ( my $tkn = $htm5->get_token ) {
        if ($tkn->is_start_tag( qr/^h[123456]$/ )) {
            next if (!$htm5->get_text);
            $headtags= $headtags . " " . $htm5->get_text;
        }
    }
    print "HEAD TAGS: $headtags\n\n\n\n"
    Thanks.

      Your first call to get_text (the one that checks whether it has a value) consumes the text, so the second call has nothing left to return. You probably want something more along the lines of:

      my $text = $htm5->get_text;
      if ($text) {
          $headtags .= " $text";
      }
        You were totally right! It works great now. Can you explain why testing for the value screws up the contents? I don't understand why it does that.
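
As a rough illustration of why the test has that effect: get_text reads from the parser's current position and moves past whatever it returns, so it is not a plain accessor. A second call made right afterwards finds the closing tag next rather than text, and returns an empty string. The sketch below assumes a made-up HTML string in place of the fetched page; $html, @first_call and @second_call are illustrative names only, not part of the original script.

```perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

# Hypothetical sample markup, standing in for the page fetched with LWP::Simple.
my $html = '<h1>First</h1><p>body</p><h2>Second</h2>';

my $parser = HTML::TokeParser::Simple->new(\$html);

my @first_call;     # what the first get_text after each heading tag returns
my @second_call;    # what an immediate second get_text returns

while ( my $token = $parser->get_token ) {
    next unless $token->is_start_tag( qr/^h[1-6]$/ );

    # The first call consumes the heading text and advances the parser.
    push @first_call, $parser->get_text;

    # The next token is now the closing tag, not text, so this is empty.
    push @second_call, $parser->get_text;
}

print "first call:  ", join( ' | ', map { "'$_'" } @first_call ),  "\n";
print "second call: ", join( ' | ', map { "'$_'" } @second_call ), "\n";
```

Storing the result of a single get_text call in a variable, as in the reply above, avoids the problem because the text is read only once.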