only have one at my fingertips: http://tinyurl/nvzfar
It seems to be pretty straightforward, and the parse completes with both HTML::TokeParser and WWW::Mechanize. The failure occurs some fixed amount of execution after the parse is finished. It fails even when I just stack in meaningless prints (only tested with Mechanize).
Let me say this again: after this loop:
foreach my $link (@links) {
    print "LINK: " . $link->url() . "\n" if ($DEBUG >= 1);
    push(@anchors, $link->url());
}
my $goodAnchors = 0;
print " @ 1ANCLOOP\n" if ($DEBUG>=1);
print " @ 2ANCLOOP\n" if ($DEBUG>=1);
print " @ 3ANCLOOP\n" if ($DEBUG>=1);
print " @ 4ANCLOOP\n" if ($DEBUG>=1);
. . .
print " @ 29ANCLOOP\n" if ($DEBUG>=1);
print " @ 30ANCLOOP\n" if ($DEBUG>=1);
it stops at "24ANCLOOP", i.e. after 24 of the 30 meaningless print statements...
BTW, I just replaced the link-parsing code (which used HTML::TokeParser) with WWW::Mechanize, and the same thing still happens, albeit at a slightly different place. The failure point changes each time I add tracing prints, but it is completely repeatable, down to the character it fails on mid-print, when that is where it dies.
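For reference, here is a minimal self-contained sketch of the kind of Mechanize link walk described above (the URL, the autocheck setting, and the variable names are placeholders of mine, not taken from the failing script). It also unbuffers STDOUT, which is worth doing when tracing a crash like this, since buffered prints can be lost if the process dies abruptly:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;

    # Unbuffer STDOUT so tracing prints appear immediately and are not
    # lost in the stdio buffer if the process dies mid-run.
    $| = 1;

    my $url  = 'http://example.com/';   # placeholder, not the real page
    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($url);

    my @anchors;
    foreach my $link ( $mech->find_all_links() ) {
        print "LINK: " . $link->url() . "\n";
        push @anchors, $link->url();
    }
    print scalar(@anchors), " links collected\n";

If a stripped-down version like this runs cleanly against the same page, the problem is more likely in the surrounding script than in the parse itself.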