in reply to Need suggestion on problem to distribute work
Forking every time, in the loop, is expensive (and by the way, make sure that you do not fork after you already have a huge data structure in memory, because it will be duplicated! Sort of, anyway; see System call doesn't work when there is a large amount of data in a hash). And each of these newly created children will have to set up, from scratch, a one-off connection to db or random_client, which is also expensive.
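To illustrate the per-task cost described above, here is a minimal Python sketch (the names handle and tasks are hypothetical stand-ins for your work loop); every iteration pays for a fresh process, and any connection setup inside the child would be repeated each time:

```python
import os

def handle(task):
    # In a real worker, the expensive one-off connection to db or
    # random_client would be opened here, once per forked child.
    return task * 2

tasks = [1, 2, 3]
for t in tasks:
    pid = os.fork()          # a brand-new process per task: expensive
    if pid == 0:             # child
        handle(t)            # ...connect, do the work, disconnect...
        os._exit(0)          # never fall back into the parent's loop
    os.waitpid(pid, 0)       # parent waits for the child
```

Note that the parent's whole address space is copied (copy-on-write, on modern systems) at each fork, which is why forking with a huge structure already in memory hurts.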
Regarding sharing DB connections between forked children, see forking and dbi, but make sure that the information is still up to date; the answer was yes-and-no at that time. As for sharing sockets (i.e. connections to random_client) between forks, it can be done, but I do not know the cost.
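Sharing a socket with a forked child works because open file descriptors are inherited across fork(2). A minimal Python sketch (Unix-only; the socket pair stands in for a connection to random_client):

```python
import os
import socket

# A connected socket pair stands in for a connection to random_client.
parent_end, child_end = socket.socketpair()

pid = os.fork()
if pid == 0:                      # child: inherits both descriptors
    parent_end.close()            # keep only its own end
    child_end.sendall(b"hello")   # use the inherited socket
    os._exit(0)

child_end.close()
reply = parent_end.recv(5)        # data written by the child
os.waitpid(pid, 0)
```

The descriptor is shared, but so is its state (offsets, buffers), which is exactly why sharing a live DB connection across forks is the yes-and-no case: two processes writing interleaved protocol traffic on one connection will confuse the server.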
So instead of spawning ephemeral children, perhaps you should consider a Pool of Workers (queue L'Internationale :) ) (see for example Re^3: thread/fork boss/worker framework recommendation and Implementing Custom ThreadPool), each listening on a separate port, to which you send the data either via shared memory or IPC. In this way each worker keeps its own connection to db or random_client alive and tightly enclosed in its own space. But you would need to create an enormous number of workers to be even 99.99% sure (it can't be 100%!) that there will always be a free worker when data arrives, unless you implement a queue and a throttler in the middle.
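The pool idea can be sketched in a few lines of Python with multiprocessing.Pool: each worker runs an initializer exactly once, which is where the long-lived db/random_client connection would be opened, and the pool's internal task queue is what removes the need to over-provision workers (init_worker and work are hypothetical names, and the connection is a stand-in object):

```python
from multiprocessing import Pool

_conn = None  # per-worker, long-lived "connection"

def init_worker():
    # Runs once in each worker process: open the expensive DB /
    # random_client connection here and keep it for all later tasks.
    global _conn
    _conn = object()  # stand-in for a real connection handle

def work(x):
    # _conn would be reused here for every task this worker receives
    return x * x

with Pool(processes=4, initializer=init_worker) as pool:
    results = pool.map(work, range(8))  # tasks are queued to workers
```

Because the pool queues tasks internally, a handful of workers is enough: a task that arrives while all workers are busy simply waits in the queue, which is the queue-and-throttler mentioned above, provided for free.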
Or perhaps a webserver already solves your problem?