DreamT has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I want to run a script that:
1. Prints some information
2. Fetches a file via LWP::UserAgent
3. Does more stuff, without waiting for the file to be fetched.

My current solution times out because the file takes too long to fetch. Can I move on in the script asynchronously?

Re: GET a file without waiting for the result?
by locked_user sundialsvc4 (Abbot) on Sep 22, 2010 at 13:34 UTC

    You could designate a (single!) thread whose job is "to do fetches." It would service a thread-safe queue: sleep until a filename appears on the queue, pop it off, fetch the file, and post a completion message onto some other queue. If you start more than one such thread, the number of threads determines how many fetches can be in progress simultaneously.
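    The worker-thread idea above can be sketched roughly as follows. This is only an illustration, not sundialsvc4's code: it assumes a Perl built with thread support and Thread::Queue 3.01 or later (for end()), and the name start_fetch_worker and the $fetch callback are hypothetical. The callback is where a real LWP::UserAgent request would go.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: start one worker thread that calls $fetch->($url)
# for every URL placed on the request queue and posts a completion
# message onto a second queue.
sub start_fetch_worker {
    my ($fetch) = @_;
    my $requests = Thread::Queue->new;
    my $done     = Thread::Queue->new;

    my $worker = threads->create(sub {
        # dequeue() blocks until an item arrives; it returns undef
        # once the queue has been end()ed and is empty.
        while (defined(my $url = $requests->dequeue)) {
            my $ok = eval { $fetch->($url); 1 };
            $done->enqueue($ok ? "ok $url" : "failed $url");
        }
    });

    return ($requests, $done, $worker);
}

# Usage: the stub below stands in for a real fetch, e.g. via LWP::UserAgent.
my ($requests, $done, $worker) = start_fetch_worker(sub {
    my ($url) = @_;
    die "empty URL\n" unless length $url;
});

$requests->enqueue('http://example.com/big-file');   # returns immediately
# ... the main script carries on with other work here ...
$requests->end;     # no more requests
$worker->join;      # wait for the worker to drain the queue
print $done->dequeue, "\n";
```

    Calling start_fetch_worker several times (or creating several threads on the same queue) would allow that many fetches in parallel.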

Re: GET a file without waiting for the result?
by moritz (Cardinal) on Sep 22, 2010 at 13:34 UTC
    When you want non-blocking actions, you nearly always also want an event loop. LWP::UserAgent::POE should work if you choose POE as your event loop.

    If you are not settled on LWP::UserAgent for getting the files, you could also take a look at AnyEvent::HTTP, Mojo::Client and other goodies that CPAN has to offer.

Re: GET a file without waiting for the result?
by JavaFan (Canon) on Sep 22, 2010 at 13:33 UTC
    My current solution times out because the file takes too long to fetch.
    Uhm, if the file takes a long time to fetch, it's going to take a long time to fetch even if you do something else in the meantime. If you get a timeout during the fetch, it's unlikely not to time out if you do it "asynchronously".

    As for doing it "asynchronously", there are three classical solutions: use an event loop, fork a new process, spawn a new thread. I don't see how any of them will help with a timeout.

      Hmm. The main problem is that the script is supposed to run via Apache, and is supposed to print "1", which will signal that it's finished. After this it will fetch the file and do some other stuff, but this won't be printed to the user.

      Maybe it's possible to tell Apache that the script is finished immediately after the "1" is printed, so that the web page stops loading, but the script continues to work in the "background"?
      It isn't a problem that the script takes time; the problem is that the system that calls the URL can't wait for the web page to load.

        It's curious that you don't want any error checking (in which case you could use Watch Long Processes Through CGI).

        I can see two approaches.

        • Spawn a task to do the work.

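          use POSIX qw( setsid );

          # Flush each print immediately so the buffered "1" is not
          # written a second time by the child after the fork.
          $| = 1;

          print "Content-Type: text/plain\n";
          print "\n";
          print "1";

          # The parent (or a failed fork) exits, which ends the response.
          my $pid = fork();
          exit(0) if !defined($pid) || $pid;

          # The child detaches from the web server's handles and session.
          open STDIN, '</dev/null' or die;
          open STDOUT, '>/dev/null' or die;
          open STDERR, '>&STDOUT' or die;
          setsid();

          ... handle request ...
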
        • Assign the work to an existing task or to a cron job.

          ... queue request in a file or database ...

          print "Content-Type: text/plain\n";
          print "\n";
          print "1";
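
          One way the "queue request in a file" step might look is sketched below. It is an assumption-laden illustration, not part of the original reply: the function names queue_request and take_requests and the spool-file layout (one URL per line) are hypothetical, and only core modules are used. The CGI script would call queue_request before printing "1"; the cron job would call take_requests and then fetch each URL.

```perl
use strict;
use warnings;
use Fcntl qw( :flock );

# Hypothetical helper for the CGI script: append one request per line,
# holding an exclusive lock so concurrent requests don't interleave.
sub queue_request {
    my ($spool, $url) = @_;
    open(my $fh, '>>', $spool) or die "Can't open $spool: $!";
    flock($fh, LOCK_EX)        or die "Can't lock $spool: $!";
    print {$fh} "$url\n"       or die "Can't write to $spool: $!";
    close($fh)                 or die "Can't close $spool: $!";
    return;
}

# Hypothetical helper for the cron job: read every pending request and
# empty the spool file while still holding the lock.
sub take_requests {
    my ($spool) = @_;
    open(my $fh, '+<', $spool) or return;   # nothing has been queued yet
    flock($fh, LOCK_EX)        or die "Can't lock $spool: $!";
    chomp(my @urls = <$fh>);
    truncate($fh, 0)           or die "Can't truncate $spool: $!";
    close($fh)                 or die "Can't close $spool: $!";
    return @urls;
}
```

          Emptying the file under the same lock means a request queued while the cron job is reading is either picked up in this run or left intact for the next one.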
Re: GET a file without waiting for the result?
by Proclus (Beadle) on Sep 22, 2010 at 22:10 UTC
    Ah, my most favorite topic. Asynchronous programming. :)
    Perl has an excellent solution: POE.
    It will have a learning curve, but it is well worth the time and effort. I am using POE-Component-Client-HTTP myself, but you can give it a try with moritz's suggestion: LWP::UserAgent::POE