Re: CGI to Pull links off webpage fails on second run

Could be any number of things. It's really hard to say without some code to look at (nudge nudge)...

MrCromeDome


Replies are listed 'Best First'.
Re: Re: CGI to Pull links off webpage fails on second run by cdherold (Monk) on Apr 10, 2003 at 21:13 UTC
A little code... This is just the link retrieval section, but even on its own it will not run twice in a row (unless 30+ minutes pass between runs).


  use LWP::UserAgent;
  use HTML::LinkExtor;
  use URI::URL;

  $url = "http://biz.yahoo.com/rf/archive.html";

  $ua = new LWP::UserAgent;

  # Set up a callback that collects links
  my @links = ();
  sub callback {
      my ($tag, %attr) = @_;
      return if $tag ne 'a';  # only look closer at written documents, not images
      push(@links, values %attr);
  }

  # Make the parser.
  $p = HTML::LinkExtor->new(\&callback);

  # Request document and parse it as it arrives
  $res = $ua->request(HTTP::Request->new(GET => $url),
                      sub { $p->parse($_[0]) });

  # Expand all URLs to absolute ones
  my $base = $res->base;
  @links = map { $_ = url($_, $base)->abs; } @links;

  print "Links: <P>@links<p>";

  exit;
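
For comparison, here is a minimal sketch of the same extraction with all state kept inside one function, so that two runs in a row cannot share a link list. It is only a guess at something worth ruling out, not a diagnosis: the post does not say how the CGI is run or what the failure looks like. The helper name `extract_links`, the inline HTML string, and the example.com base URL are all hypothetical; the sketch parses a string instead of fetching the Yahoo page so it can be run offline.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical helper (not from the original post): returns the absolute
# URLs of all <a href> links in $html, resolved against $base.
sub extract_links {
    my ($html, $base) = @_;
    my @links;    # a fresh array on every call

    # An anonymous callback closes over this call's @links only.
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            return if $tag ne 'a';    # skip img, link, etc.
            push @links, $attr{href} if defined $attr{href};
        },
        $base,    # with a base URL, LinkExtor hands back absolute URI objects
    );

    $parser->parse($html);
    $parser->eof;

    return map { "$_" } @links;    # stringify the URI objects
}

# Assumed sample input, for illustration only.
my $html = '<a href="/a.html">A</a> <img src="x.png"> '
         . '<a href="http://example.org/b">B</a>';

for my $run (1 .. 2) {
    my @found = extract_links($html, 'http://example.com/dir/');
    print "run $run: @found\n";
}
```

If this version behaves the same on every run while the original still fails on the second, the shared `@links` and the named `callback` sub would be the next place to look; if both fail, the cause is more likely on the fetching side.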