filmo has asked for the wisdom of the Perl Monks concerning the following question:

I'm using the LWP module to plow through a bunch of pages (~10K) sequentially. I've tried using the LWP::Simple 'get' and the LWP::UserAgent 'POST' methods. Both seem extremely slow to retrieve the page. (I realize that this is partially a function of the server on the other end.) It seems to hang on certain pages longer than others. Is there a way to "timeout" the request after X seconds and move on so that I'm not waiting forever for it to return an undef value?

It seems that the average page takes about 2 seconds to be requested using either method, but the occasional page can take a minute or more. I'd rather either skip those entirely or log them for future lookup, but in either case simply move quickly to the next page. Thanks.
--
Filmo the Klown

Replies are listed 'Best First'.
Re: LWP Module - timing out
by arhuman (Vicar) on May 23, 2001 at 11:49 UTC
    I would suggest using Super Search (keywords: LWP timeout).

    You'll find interesting posts, and may discover potential problems:
    • timeout not working when establishing a connection to an unreachable host
    • the danger of setting your own sig handler
    • redirection handling (3xx response)
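
For the cases where the LWP timeout alone does not fire, one commonly used fallback on Unix-like systems is the eval/alarm idiom, which is exactly the "own sig handler" approach cautioned about above. The following is only a sketch under assumptions: `with_deadline` is a hypothetical helper name, the work passed in would normally be the LWP request, and the stand-in closures here just return or wait so the example runs without a network.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but give up after $secs seconds.
# Returns the code's (scalar) result, or undef if the deadline hit.
sub with_deadline {
    my ($secs, $code) = @_;
    my $value;
    my $finished = eval {
        # Localised handler, so any outer ALRM handler is restored afterwards.
        local $SIG{ALRM} = sub { die "deadline\n" };
        alarm $secs;
        $value = $code->();
        alarm 0;          # cancel the pending alarm on success
        1;
    };
    my $err = $@;
    alarm 0;              # also cancel if $code died for another reason
    return $value if $finished;
    die $err unless $err eq "deadline\n";   # re-throw unrelated errors
    return;
}

# Stand-ins for a fast and a slow request (assumptions for the demo).
my $fast = with_deadline(2, sub { "done" });
my $slow = with_deadline(1, sub { select(undef, undef, undef, 5); "never" });

print "fast: $fast\n";
print "slow: ", (defined $slow ? $slow : "timed out"), "\n";
```

The handler is installed with `local` inside the `eval` so that it is removed again when the block exits. The demo waits with four-argument `select` rather than `sleep`, since mixing `sleep` and `alarm` is not portable.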

    "Only Bad Coders Badly Code In Perl" (OBC2IP)
Re: LWP Module - timing out
by Anonymous Monk on May 23, 2001 at 11:46 UTC
    From the LWP::UserAgent documentation:
    $ua->timeout([$secs])
    Get/set the timeout value in seconds. The default timeout() value is 180 seconds, i.e. 3 minutes.
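
As a minimal sketch of how that setting could be used for the loop described in the question: the 10-second value, taking the URL list from `@ARGV`, and the `@skipped` log are all assumptions for illustration. Note that, per the LWP::UserAgent documentation, the timeout counts seconds without activity on the connection rather than total request time, so a server that trickles data slowly can still take longer than the configured value.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical input: the pages to fetch, passed on the command line.
my @urls = @ARGV;

my $ua = LWP::UserAgent->new;
$ua->timeout(10);    # assumed value: give up after 10 seconds of inactivity

my @skipped;         # pages to retry or look up later
for my $url (@urls) {
    my $response = $ua->get($url);
    if ($response->is_success) {
        my $content = $response->decoded_content;
        # ... process $content here ...
    }
    else {
        # Timeouts and other failures both end up here; log and move on.
        warn "Skipping $url: ", $response->status_line, "\n";
        push @skipped, $url;
    }
}

print "Skipped ", scalar(@skipped), " page(s)\n";
```

A failed request does not die or return undef here; it yields a response object whose `is_success` is false, so the loop simply continues with the next URL.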