coldfingertips has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that parses anywhere from one to infinity links (well, let's hope it doesn't quite go THAT high, but I've done as many as 178 links at once). Some servers have taken 10+ or 30+ seconds, and even more than a few minutes, before I got a response, and it slows the script. How can I set up a time-out so that if a request takes more than XX seconds, the script skips it and moves on?

A little background: the script takes my URL and collects all the links on the page. I'm trying to create a spider, so I'm using LWP::Simple on MY page and then again in a loop for all the links it finds. This means it has to wait for each link to either work or die; otherwise it just stops and acts like it's busted.

  • Comment on How to make a time-out with LWP::Simple

Replies are listed 'Best First'.
Re: How to make a time-out with LWP::Simple
by b10m (Vicar) on Mar 25, 2004 at 16:54 UTC
      Okay, I went to that and I now know how to set a timeout. The question now is, how do I check for it? Rather than just timing out and moving on, I need to print a message to the browser stating that it skipped the URL.
      use CGI qw/:standard/;
      use LWP::Simple qw(!head $ua);
      use LWP::UserAgent;
      use HTML::LinkExtor;
      use URI::URL;

      my $ua = LWP::UserAgent->new;
      $ua->timeout(10);

      my $p = HTML::LinkExtor->new;
      my $res = $ua->request(HTTP::Request->new(GET => $url), sub {$p->parse($_[0])});

      #### INSERT TIMEOUT CHECK HERE
      if (timeout) {
          print "This url timed out";
      }

      if ($res->status_line =~ /NOT FOUND/i) {
          print "NOT FOUND!";
      }
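One possible way to fill in the "INSERT TIMEOUT CHECK HERE" spot is to inspect the response object. When LWP::UserAgent gives up on a request itself, it returns a 500 response carrying a `Client-Warning: Internal response` header, and a failure inside the content callback phase is reported in an `X-Died` header. The sketch below relies on that; the helper name `looks_like_timeout` is hypothetical, and matching on the message text is a heuristic assumption, since the exact wording ("read timeout", "timed out", and so on) varies between LWP versions and platforms. The two responses in the demo are built by hand so the example runs without a network.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper (not part of LWP): guess whether a response
# object represents a request that timed out.
sub looks_like_timeout {
    my ($res) = @_;

    # A body read that died part-way is reported in the X-Died header.
    my $died = $res->header('X-Died') // '';
    return 1 if $died =~ /time(?:d\s+)?out/i;

    # Errors generated by LWP itself (not by the remote server)
    # carry a Client-Warning header.
    my $warning = $res->header('Client-Warning') // '';
    return 0 unless $warning eq 'Internal response';

    # Heuristic: the message wording is an assumption, not a stable API.
    return $res->message =~ /time(?:d\s+)?out/i ? 1 : 0;
}

# Hand-built responses for demonstration only; no request is sent.
my $timed_out = HTTP::Response->new(500, 'read timeout');
$timed_out->header('Client-Warning' => 'Internal response');
my $not_found = HTTP::Response->new(404, 'Not Found');

for my $res ($timed_out, $not_found) {
    my $verdict = looks_like_timeout($res) ? 'timed out' : 'not a timeout';
    print $res->status_line, ": $verdict\n";
}
```

In the script above, this would be called as `print "This url timed out" if looks_like_timeout($res);` right after the `$ua->request(...)` call. For the "not found" case, comparing `$res->code` against 404 is likely more robust than matching the status line text.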
•Re: How to make a time-out with LWP::Simple
by merlyn (Sage) on Mar 25, 2004 at 17:03 UTC
Re: How to make a time-out with LWP::Simple
by eric256 (Parson) on Mar 25, 2004 at 17:03 UTC

    POE Example of Web Client: it fetches the pages in parallel, which would be a good idea if you want to get lots of pages.


    ___________
    Eric Hodges
Re: How to make a time-out with LWP::Simple
by Wonko the sane (Curate) on Mar 25, 2004 at 19:11 UTC
    Hello,

    I am not sure how you could catch a timeout set in UserAgent, but wrapping your call in an eval, with an alarm, would
    probably fit your needs.

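    use LWP::Simple;

    $SIG{ALRM} = sub { die q{timeout} };

    URL:
    foreach my $url ( @urls ) {
        my $page;    # declared outside the eval so it is still visible afterwards.
        eval {
            alarm( 2 );             # amount of time allowed for call.
            $page = get( $url );    # LWP::Simple call
            alarm( 0 );             # cancel the alarm once the call returns.
        };
        alarm( 0 );                 # make sure no alarm is left pending.

        if ( $@ =~ /^timeout/ ) {
            print qq{Skipping this one, timed out.\n};
            next URL;
        }

        # Do something with successful call.
    }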

    Wonko
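For the timeout setting itself, a minimal sketch that stays within LWP::Simple: the module can export the LWP::UserAgent object it uses internally as `$ua`, so a timeout can be set on it before calling `get`. The helper name `fetch_or_skip` and the 10-second value are assumptions for illustration. Since `get` returns `undef` on any failure, this version cannot tell a timeout apart from other errors.

```perl
use strict;
use warnings;
use LWP::Simple qw($ua get);

# $ua is the LWP::UserAgent object that LWP::Simple uses internally.
$ua->timeout(10);    # seconds; an example value, not a recommendation

# Hypothetical helper: fetch a URL, or report that it was skipped.
sub fetch_or_skip {
    my ($url) = @_;
    my $page = get($url);    # undef on any failure, including a timeout
    print "Skipping $url (timed out or failed)\n" unless defined $page;
    return $page;
}
```

A call such as `my $page = fetch_or_skip($url) or next;` would fit inside the link loop. Note that the LWP::UserAgent timeout counts inactivity on the connection rather than total elapsed time, so a slow but steady server can still take longer than the configured value; the `alarm` approach above puts a hard limit on the whole call.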