tfoertsch has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have to manage many parallel HTTP/HTTPS connections. My current implementation starts a number of threads (300-500), each running its own LWP::UserAgent, and uses a queue to pass connection requests from the manager thread to the workers.
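
For reference, a stripped-down sketch of that kind of setup (not my real code; the URLs are just placeholders) looks roughly like this:

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;
    use LWP::UserAgent;

    my @urls      = qw(http://example.com/ https://example.org/);  # placeholders
    my $queue     = Thread::Queue->new;
    my $n_workers = 300;                       # 300-500 in the real code

    my @workers = map {
        threads->create(sub {
            my $ua = LWP::UserAgent->new;      # one UA per thread
            while ( defined( my $url = $queue->dequeue ) ) {
                my $res = $ua->get($url);
                # ... process $res ...
            }
        });
    } 1 .. $n_workers;

    $queue->enqueue($_) for @urls;             # manager feeds the workers
    $queue->enqueue( (undef) x $n_workers );   # undef tells a worker to exit
    $_->join for @workers;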

Since Perl's thread implementation is rather memory-hungry, this is not an optimal solution: 500 threads is about the maximum on my 1 GB machine.

I want to change the code to use non-blocking IO and select() on ready file handles, without losing the comfort of LWP::UserAgent.

What is the best way? Is that feasible with LWP::UserAgent? Is there a better solution to handle many parallel connections? I need HTTP and HTTPS.

Thanks,
Torsten

Replies are listed 'Best First'.
Re: managing many parallel HTTP requests
by merlyn (Sage) on Sep 09, 2005 at 13:30 UTC
    Sounds like a job for POE. Using the non-blocking LWP client, you end up with exactly what you're looking for. DNS requests and web page fetches will do their thing as if in parallel, with callback events as the page comes in, or as things finish. In fact, that even sounds like one of my columns.
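
    For a flavour of it, here is a minimal sketch along those lines, assuming POE::Component::Client::HTTP (untested here; the URLs are placeholders):

        use strict;
        use warnings;
        use POE qw(Component::Client::HTTP);
        use HTTP::Request;

        my @urls = qw(http://example.com/ http://example.org/);  # placeholders

        POE::Component::Client::HTTP->spawn(
            Alias   => 'ua',        # the name we post requests to
            Timeout => 60,
        );

        POE::Session->create(
            inline_states => {
                _start => sub {
                    # queue up all requests; they run concurrently
                    $_[KERNEL]->post( 'ua', 'request', 'got_response',
                        HTTP::Request->new( GET => $_ ) ) for @urls;
                },
                got_response => sub {
                    my ( $request_packet, $response_packet ) = @_[ ARG0, ARG1 ];
                    my $request  = $request_packet->[0];    # the HTTP::Request
                    my $response = $response_packet->[0];   # the HTTP::Response
                    print $request->uri, " -> ", $response->status_line, "\n";
                },
            },
        );

        POE::Kernel->run();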

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Re: managing many parallel HTTP requests
by Corion (Patriarch) on Sep 09, 2005 at 11:13 UTC

    There are several ways to keep both the comfort of LWP and the parallelism.

    One would be to fork() instead of using Perl threads. This method comes at the price of additional IPC, which I'd handle via a database.

    Another way would be to use Coro, which comes at the cost of a new XS module to install and some slight instability introduced by Coro, depending on your platform.
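
    One way to do that (assuming the Coro::LWP module from CPAN, which replaces LWP's sockets with non-blocking ones so LWP calls inside coroutines yield to each other instead of blocking) would look roughly like this - untested, placeholder URLs:

        use strict;
        use warnings;
        use Coro;
        use Coro::LWP;        # must be loaded before LWP starts doing I/O
        use LWP::UserAgent;

        my @urls = qw(http://example.com/ https://example.org/);  # placeholders

        my @coros = map {
            my $url = $_;
            async {
                my $ua  = LWP::UserAgent->new;
                my $res = $ua->get($url);   # yields to other coros while waiting
                print "$url -> ", $res->status_line, "\n";
            };
        } @urls;

        $_->join for @coros;   # wait for all coroutines to finish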

    Yet another way would of course be to simply spread the work across more machines.

      If you decide to use fork, take a look at Parallel::ForkManager, which will do much of the dirty work for you.
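
      A rough sketch of that approach (capping the number of concurrent child processes; the URLs and the limit of 50 are placeholders) might look like this:

          use strict;
          use warnings;
          use Parallel::ForkManager;
          use LWP::UserAgent;

          my @urls = qw(http://example.com/ https://example.org/);  # placeholders
          my $pm   = Parallel::ForkManager->new(50);  # at most 50 children at once

          for my $url (@urls) {
              $pm->start and next;              # parent: move on to the next URL
              my $ua  = LWP::UserAgent->new;
              my $res = $ua->get($url);
              print "$url -> ", $res->status_line, "\n";
              $pm->finish;                      # child exits here
          }
          $pm->wait_all_children;

      Any results the parent needs to see still have to come back via some form of IPC - the database Corion mentions above, for example.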