package HTML::TokeParser::Listerine;
use strict;
use warnings;
use base 'HTML::TokeParser';

sub get_tag {
    my $self = shift;
    if (wantarray) {

        # build and return a list

        my @tags;
        while ( my $tag = $self->SUPER::get_tag(@_) ) {    # delegate to superclass
            push @tags, $tag;
        }
        return @tags;
    }
    else { return $self->SUPER::get_tag(@_) }
}

sub get_token {
    my $self = shift;
    if (wantarray) {

        # build and return a list

        my @tokens;
        while ( my $token = $self->SUPER::get_token(@_) )
        {    # delegate to superclass
            push @tokens, $token;
        }
        return @tokens;
    }
    else { return $self->SUPER::get_token(@_) }
}

1;

__END__

=pod

=head1 NAME

HTML::TokeParser::Listerine - Context-sensitive HTML token parsing

=head1 SYNOPSIS

 use HTML::TokeParser::Listerine;
 my $html = q {

 <html>
  <body>
   <!-- Match my comment, and include it  -->
   <!-- in the output of get_token        -->
   <a href="http://www.foo.com">Bar</a><br />
   <a href="http://www.bar.com">Foo</a><br />
  </body>
 </html>

 };

 my $p = HTML::TokeParser::Listerine->new(\$html);

 # parse the HTML with map rather than a tedious while loop!
 # get_token would work too, after filtering for 'a' start tags
 # (their attribute hash sits at index 2 rather than index 1)
 my @links = map { $_->[1]->{href} } $p->get_tag('a');

 print "Links are: ", join("\n", @links), "\n";

=head1 DESCRIPTION

HTML::TokeParser::Listerine overrides the C<get_tag> and C<get_token> methods
of HTML::TokeParser to make them DWIM in list context, such as the one
provided by the C<grep> and C<map> operators. This lets you write terse yet
complex filters, rather than having to enter a big while loop every time you
want to parse HTML, which isn't easy on the eye.

This is, if anything, slightly slower than a hand-written while loop:
internally it runs the same loop, and it also builds the full list before
returning. It simply saves you typing, and that can be a lot more convenient
than you think.
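
For comparison, here is a rough sketch of the same link extraction written
both ways. The sample markup and the variable names are made up for the
illustration; only C<new> and C<get_tag> come from the modules themselves.

 use strict;
 use warnings;
 use HTML::TokeParser;
 use HTML::TokeParser::Listerine;

 my $html = '<a href="http://www.foo.com">Bar</a>'
          . '<a href="http://www.bar.com">Foo</a>';

 # plain HTML::TokeParser: one tag per call, so you loop
 my $plain = HTML::TokeParser->new(\$html);
 my @loop_links;
 while ( my $tag = $plain->get_tag('a') ) {
     push @loop_links, $tag->[1]{href};
 }

 # Listerine: list context hands back every remaining <a> start tag
 my $p = HTML::TokeParser::Listerine->new(\$html);
 my @map_links = map { $_->[1]{href} } $p->get_tag('a');

Both arrays should end up holding the same two URLs; the second form just
hides the loop inside the overridden method.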

=head1 METHODS

The only difference from HTML::TokeParser is that if you call the methods
C<get_tag> and C<get_token> in list context, they return a list of all the
remaining (matching) tags and tokens, respectively, consuming the rest of the
document. In scalar context they behave the same as vanilla TokeParser does.
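
As a small illustration of the context rule (a sketch only; the sample markup
and variable names are invented for the example):

 use strict;
 use warnings;
 use HTML::TokeParser::Listerine;

 my $html = '<p>one</p><!-- note --><p>two</p>';
 my $p    = HTML::TokeParser::Listerine->new(\$html);

 # scalar context: just the next token, as in plain HTML::TokeParser
 my $first = $p->get_token;    # the opening <p> start-tag token

 # list context: every token still left in the stream;
 # comment tokens carry the type code "C" in their first element
 my @comments = grep { $_->[0] eq 'C' } $p->get_token;

 # the stream is now exhausted, so a further call returns undef
 my $none = $p->get_token;

Bear in mind that any list context triggers the collecting behaviour, so
something like C<< my ($tag) = $p->get_tag('a') >> reads to the end of the
document even though only the first tag is kept.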

=head1 AUTHOR

Amoe.

=head1 REQUIREMENTS

HTML::TokeParser and everything it depends on.

=head1 SEE ALSO

The L<HTML::TokeParser> and L<HTML::PullParser> manpages.

=cut

In reply to HTML::TokeParser::Listerine by Amoe
