in reply to Parsing web data by tag... help?

Write a "peek" function that returns the next $count tokens in the stream as HTML, then uses the unget_token method to restore your parser to its original state. Here's my take on it. It should be easy enough to munge this for your needs.

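    #!/usr/bin/perl
    use strict;
    use warnings;
    use HTML::TokeParser::Simple 3.13;

    my $url    = 'http://www.perlmonks.com/';
    my $parser = HTML::TokeParser::Simple->new( url => $url );

    # Skip ahead to the first <ul>, then preview what follows it.
    $parser->get_tag("ul");
    print peek($parser);

    # Return the next $count tokens as HTML without consuming them.
    sub peek {
        my $parser = shift;
        my $count  = shift || 5;
        my $items  = 0;
        my $html   = '';
        my @tokens;

        # Check the count before fetching, so no token is read and then dropped.
        while ( $items++ < $count && ( my $token = $parser->get_token ) ) {
            $html .= $token->as_is;
            push @tokens, $token;
        }

        # Push the tokens back so the parser returns to its original state.
        $parser->unget_token(@tokens);
        return $html;
    }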
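If you want to convince yourself that peeking really leaves the parser where it was, here is a rough sketch that runs against an in-memory string instead of a live URL. The sample markup and the variable names $preview and $next are my own assumptions for illustration, and the peek here is just a self-contained restatement of the same idea, not anything shipped with the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple 3.13;

# Hypothetical sample markup, parsed from a string so no network is needed.
my $sample = '<ul><li>one</li><li>two</li></ul>';
my $parser = HTML::TokeParser::Simple->new( string => $sample );

# Position the parser just after the opening <ul>.
$parser->get_tag('ul');

# Look ahead three tokens, then read the next token the normal way.
my $preview = peek( $parser, 3 );
my $next    = $parser->get_token;

print "peeked: $preview\n";
print "next:   ", $next->as_is, "\n";

# Return up to $count upcoming tokens as HTML, then put them all back.
sub peek {
    my ( $parser, $count ) = @_;
    $count ||= 5;
    my $html = '';
    my @tokens;
    while ( @tokens < $count ) {
        my $token = $parser->get_token or last;    # stop at end of stream
        $html .= $token->as_is;
        push @tokens, $token;
    }
    $parser->unget_token(@tokens);                 # restore parser state
    return $html;
}
```

If the parser was restored correctly, the "next" line should show the same token that starts the peeked text.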

You know what? I like this so much I should probably add it to HTML::TokeParser::Simple.

Cheers,
Ovid

New address of my CGI Course.

Re^2: Parsing web data by tag... help?
by SpacemanSpiff (Sexton) on Sep 06, 2005 at 21:08 UTC
    Just wanted to thank everyone for the prompt replies. Even though I've read it in the manual, practical examples and explanations are helping put it all together. For whatever reason, what wasn't working before now seems to when I try it again. Must be some divine power from this place... Anyway, thanks for the patience. I've picked up the llama and camel (the whole bookshelf CD kit at that) and hope to keep the dumb questions to a minimum on my journey to become a Perl monk(ey).