package HTML::TokeParser::Listerine;

use strict;
use warnings;
use base 'HTML::TokeParser';

sub get_tag {
    my $self = shift;

    if (wantarray) {
        # build and return a list
        my @tags;
        while ( my $tag = $self->SUPER::get_tag(@_) ) {    # delegate to superclass
            push @tags, $tag;
        }
        return @tags;
    }
    else {
        return $self->SUPER::get_tag(@_);
    }
}

sub get_token {
    my $self = shift;

    if (wantarray) {
        # build and return a list
        my @tokens;
        while ( my $token = $self->SUPER::get_token(@_) ) {    # delegate to superclass
            push @tokens, $token;
        }
        return @tokens;
    }
    else {
        return $self->SUPER::get_token(@_);
    }
}

1;

__END__

=pod

=head1 NAME

HTML::TokeParser::Listerine - Context-sensitive HTML token parsing

=head1 SYNOPSIS

    use HTML::TokeParser::Listerine;

    my $html = q {
        <html>
        <body>
        <!-- Match my comment, and include it -->
        <!-- in the output of get_token -->
        <a href="http://www.foo.com">Bar</a><br />
        <a href="http://www.bar.com">Foo</a><br />
        </body>
        </html>
    };

    my $p = HTML::TokeParser::Listerine->new(\$html);

    # magically parse html with map rather than tedious while!
    # you could also use get_token to do this
    my @links = map { $_->[1]->{href} } $p->get_tag('a');

    print "Links are: ", join("\n", @links), "\n";

=head1 DESCRIPTION

HTML::TokeParser::Listerine overrides the C<get_tag> and C<get_token>
methods of HTML::TokeParser to make them DWIM in a list context, for
example one provided by the C<grep> and C<map> operators. This allows you
to do terse complex filtering, rather than having to enter a big while
loop every time you want to parse HTML, which isn't easy on the eye.

Obviously, this is a slower approach than doing it with a while loop, as
internally it uses the same while loop and builds the whole list before
returning it. It simply saves you typing, and that can be a lot more
convenient than you think.

=head1 METHODS

The only difference from HTML::TokeParser is that if you call the methods
C<get_tag> and C<get_token> in list context, they return a list of all
the remaining tags and tokens, respectively. Using them in scalar context
should behave the same as vanilla TokeParser does.

=head1 AUTHOR

Amoe.

=head1 REQUIREMENTS

HTML::TokeParser and everything it depends on.

=head1 SEE ALSO

HTML::TokeParser and HTML::PullParser manpages.

=cut
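For anyone who wants to see the context dispatch used by get_tag and get_token in isolation, here is a minimal sketch that does not need HTML::TokeParser installed. The Counter and Counter::Greedy packages and the pull method are hypothetical stand-ins invented for this illustration; they are not part of the module above. Unlike the module, the loop tests with defined so that a false item such as 0 does not end it early.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a pull parser: hands out one item per call,
# then undef once the items are exhausted.
package Counter;

sub new {
    my ( $class, @items ) = @_;
    return bless { items => [@items] }, $class;
}

sub pull {
    my $self = shift;
    return shift @{ $self->{items} };
}

# Subclass that drains the iterator when called in list context,
# mirroring the wantarray branch in get_tag and get_token.
package Counter::Greedy;
our @ISA = ('Counter');

sub pull {
    my $self = shift;

    # scalar (or void) context: behave exactly like the parent
    return $self->SUPER::pull(@_) unless wantarray;

    # list context: collect everything that is left
    my @all;
    while ( defined( my $item = $self->SUPER::pull(@_) ) ) {
        push @all, $item;
    }
    return @all;
}

package main;

my $it    = Counter::Greedy->new(qw(a b c d));
my $first = $it->pull;    # scalar context: one item
my @rest  = $it->pull;    # list context: everything remaining

print "first=$first rest=@rest\n";    # prints "first=a rest=b c d"
```

Note that a list-context call consumes the rest of the stream, so a later scalar call on the same object returns undef.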
