supernoob has asked for the wisdom of the Perl Monks concerning the following question:

I know there are at least 10 different ways to send HTTP requests in parallel; personally I have been using both threads and Parallel::ForkManager with LWP::UserAgent. But I have been facing 'out of memory' issues lately, so I wonder if any experts here can advise me on a more memory-efficient way to get the job done? I have heard of Mojo and Coro, though I have never tried them; would they be better?

Replies are listed 'Best First'.
Re: Most memory efficient way to parallel http request
by davido (Cardinal) on Jan 04, 2014 at 21:17 UTC

    Here's an example from the Mojolicious Cookbook:

    use Mojo::UserAgent;
    use Mojo::IOLoop;

    # Parallel non-blocking requests
    my $ua = Mojo::UserAgent->new;
    $ua->get('http://metacpan.org/search?q=mojo' => sub {
        my ($ua, $mojo) = @_;
        ...
    });
    $ua->get('http://metacpan.org/search?q=mango' => sub {
        my ($ua, $mango) = @_;
        ...
    });

    # Start event loop if necessary
    Mojo::IOLoop->start unless Mojo::IOLoop->is_running;

    So in this example two GET requests are sent, and the callbacks execute once the requests are fulfilled. The event loop is Mojolicious's own standalone loop (Mojo::IOLoop), but if you have EV installed on your system, it will be used automatically internally, for improved performance.

    Non-blocking concurrency is a strong focus of Mojolicious's ongoing development. As user agents go, it would be hard to find one whose developers put more effort into non-blocking IO support.


    Dave

Re: Most memory efficient way to parallel http request
by BrowserUk (Patriarch) on Jan 05, 2014 at 03:13 UTC
    both threads ... with lwp useragent. But I have been facing 'out of memory' issues lately

    That almost certainly means you've failed to control the level of concurrency, and/or haven't discovered/used the threads stack_size => nnnn option. Post your threaded code and we can probably fix that for you.
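
    To make those two points concrete, here is a minimal sketch of a bounded thread pool with a reduced per-thread stack. The worker count, stack size, and URLs are illustrative choices, not prescriptions, and the worker just passes items through where real code would call LWP::UserAgent, so the sketch runs without a network:

    ```perl
    use strict;
    use warnings;
    use threads ('stack_size' => 64 * 4096);  # 256KB per thread instead of the default (often several MB)
    use Thread::Queue;

    my $NWORKERS = 4;                 # fixed pool size bounds memory use
    my $work = Thread::Queue->new;    # URLs in
    my $done = Thread::Queue->new;    # results out

    sub worker {
        while (defined(my $url = $work->dequeue)) {
            # A real worker would do LWP::UserAgent->new->get($url) here;
            # we just pass the item through so the sketch runs offline.
            $done->enqueue($url);
        }
    }

    my @pool = map { threads->create(\&worker) } 1 .. $NWORKERS;

    $work->enqueue("http://example.com/page$_") for 1 .. 10;
    $work->enqueue(undef) for 1 .. $NWORKERS;   # one terminator per worker
    $_->join for @pool;

    my @fetched;
    while (defined(my $url = $done->dequeue_nb)) { push @fetched, $url }
    print scalar(@fetched), " urls processed\n";   # prints "10 urls processed"
    ```

    The point is that no matter how many URLs you enqueue, only $NWORKERS threads (and their stacks) ever exist at once.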

    However, the most memory efficient and fastest mechanism for performing parallel downloads that I've used is LWP::Parallel.

    It's not the simplest -- that'd be a thread pool -- but, used with care and understanding, it can easily saturate most broadband connections whilst requiring almost no CPU and very little memory, and it is far simpler than all of the other state-driven modules.
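
    A rough sketch of the LWP::Parallel approach, from memory of the module's documentation (the URLs, limits, and timeout are placeholder values -- check the LWP::Parallel::UserAgent POD before relying on them):

    ```perl
    use strict;
    use warnings;
    use LWP::Parallel::UserAgent;
    use HTTP::Request;

    my $pua = LWP::Parallel::UserAgent->new;
    $pua->max_hosts(10);   # cap the number of hosts contacted at once
    $pua->max_req(3);      # cap concurrent requests per host

    # Register everything up front; nothing is fetched yet.
    $pua->register(HTTP::Request->new(GET => $_))
        for qw(http://example.com/a http://example.com/b);

    # One blocking call services all connections; 30s overall timeout.
    my $entries = $pua->wait(30);
    for my $entry (values %$entries) {
        my $res = $entry->response;
        printf "%s: %s\n", $res->request->uri, $res->code;
    }
    ```

    Because everything runs in a single process with no per-request thread or fork, the memory footprint stays essentially flat regardless of how many requests you register.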


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Most memory efficient way to parallel http request
by Preceptor (Deacon) on Jan 05, 2014 at 10:35 UTC

    "Out of Memory" is more likely a symptom of a memory leak than of your choice of implementation. There are differences between the various approaches to concurrency, but fundamentally they're all doing pretty similar things. I'd suggest you're more likely suffering 'out of memory' because something is spawning new instances too aggressively.

    Posting code might mean someone can point out the problem.