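If you switch to HTML::TokeParser::Simple, I think you'll be happy with how much clearer the logic is. The loop below looks for each td start tag whose class is 'earlyHeadline' and, if the very next token is an anchor, prints its href as an absolute URL: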
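use strict;
use warnings;
use HTML::TokeParser::Simple;
use LWP::Simple;
use URI;

my $url = 'http://www.reuters.com/newsEarlierArticles.jhtml?type=businessNews';

# Fetch first: get() returns undef on failure, and LWP::Simple does not set $!.
my $html = get($url);
die "Couldn't read $url\n" unless defined $html;

my $stream = HTML::TokeParser::Simple->new(\$html)
    or die "Couldn't parse the content of $url\n";

while (my $token = $stream->get_token) {
    # Skip everything except <td class="earlyHeadline">.
    next unless $token->is_start_tag('td')
        and ($token->return_attr('class') || '') eq 'earlyHeadline';

    # The headline link is expected to be the very next token.
    my $next = $stream->get_token or last;
    if ($next->is_start_tag('a')) {
        # Resolve relative hrefs against the page URL.
        print URI->new_abs($next->return_attr('href'), $url), "\n";
    }
}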
Cheers,
Ovid