in reply to IPC::Open, Parallel::ForkManager, or Sockets::IO for parallelizing?

My guess is that you're mostly blocked waiting on the network, so I would look at Mojo::UserAgent (part of Mojolicious) or Net::Async::HTTP as user agents for fetching the data, and then process whatever just came in, potentially in a separate process.
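To make the non-blocking fetch idea concrete, here is a minimal sketch using Mojo::UserAgent promises with a cap on the number of requests in flight. The helper name fetch_all, the callback signature and the concurrency value are made up for illustration, and Mojo::Promise->map needs a reasonably recent Mojolicious, so treat this as a starting point rather than a finished fetcher.

```perl
use strict;
use warnings;
use Mojo::UserAgent;
use Mojo::Promise;

# Hypothetical helper: fetch all @urls with at most $concurrency requests
# in flight, calling $on_response->($url, $res) for each response.
# Returns the list of URLs that failed (connection errors, callback died).
sub fetch_all {
    my ($ua, $concurrency, $on_response, @urls) = @_;
    return unless @urls;
    my @failed;

    Mojo::Promise->map(
        { concurrency => $concurrency },
        sub {
            my $url = $_;    # map() puts the current item in $_
            return $ua->get_p($url)->then(sub {
                my ($tx) = @_;
                # result() dies on connection errors, which rejects
                # the promise and lands in the catch below.
                $on_response->($url, $tx->result);
            })->catch(sub {
                my ($err) = @_;
                push @failed, $url;
                warn "$url failed: $err\n";
            });
        },
        @urls,
    )->wait;    # run the event loop until every fetch has settled

    return @failed;
}

# Example run: perl fetch.pl URL [URL ...]
if (!caller && @ARGV) {
    my $ua = Mojo::UserAgent->new(max_redirects => 3);
    fetch_all($ua, 5, sub {
        my ($url, $res) = @_;
        printf "%s -> %d, %d bytes\n", $url, $res->code, length($res->body);
    }, @ARGV);
}
```

The processing callback runs in the same process here. If processing is CPU-heavy, it could hand the response body off to a separate process instead of working on it inline.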

I'm not aware of a convenient fetching framework, but a Future-based approach has worked well for me in my toy scraper, COWS, specifically in its fetching queue and the program operating on that queue.

Alternatively, I would look at any kind of persistent queue (database, filesystem, Queue), keep whatever existing program you have, and fire off instances of it using (say) Parallel::ForkManager.
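As a rough sketch of that second approach, the following runs an existing single-item program once per queue entry, with a fixed number of children at a time. The names run_queue and fetch-one.pl, the one-item-per-line queue file and the limit of 8 children are all assumptions for illustration. A real persistent queue would also record which items have been completed.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical helper: run @$cmd once per item, at most $max_procs at a time.
# Returns the items whose command exited non-zero.
sub run_queue {
    my ($max_procs, $cmd, @items) = @_;
    my @failed;
    my $pm = Parallel::ForkManager->new($max_procs);

    # Called in the parent each time a child is reaped.
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident) = @_;
        push @failed, $ident if $exit_code;
    });

    for my $item (@items) {
        $pm->start($item) and next;           # parent: go on to the next item
        my $status = system(@$cmd, $item);    # child: run the existing program
        $pm->finish($status == 0 ? 0 : 1);    # child exits here
    }
    $pm->wait_all_children;

    return @failed;
}

# Example run: perl run-queue.pl queue.txt
# Assumes queue.txt holds one work item per line and fetch-one.pl is
# the existing program that handles a single item.
if (!caller && @ARGV) {
    my ($queue_file) = @ARGV;
    open my $fh, '<', $queue_file or die "Cannot read $queue_file: $!";
    chomp(my @items = grep { /\S/ } <$fh>);
    close $fh;

    my @failed = run_queue(8, [$^X, 'fetch-one.pl'], @items);
    warn "Failed: $_\n" for @failed;
}
```

Since each item runs in its own process, the existing program needs no changes. The cost is one fork plus one program start per item, which hardly matters when the work is dominated by network waits.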
