in reply to fixed set of forked processes
I would approach this sort of problem by defining a small, fixed, and configurable number of threads, all of which are built to do the same thing: read a work-request from a single queue (e.g. Thread::Queue::Duplex), perform the unit of work (in an eval{} block), and write a response-record to the same or to a different queue.
All of the threads, no matter how many there are, are reading from and writing to the same queues. So, when a record is written to “the request queue,” no one really cares which thread winds up picking up the request and running it.
The threads, in turn, are built to survive. Any runtime error that may occur during processing is absorbed, and a record of that event is merely added to the response-record for someone else down the line to deal with.
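
A minimal sketch of such a worker pool, for illustration only: it uses the core Thread::Queue module rather than Thread::Queue::Duplex, the pool size and the process_record() routine are stand-ins for whatever the real work happens to be, and it assumes a Thread::Queue recent enough to shared-clone the hash reference that gets enqueued.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Shared queues: one for work-requests, one for response-records.
my $requests  = Thread::Queue->new();
my $responses = Thread::Queue->new();

# Stand-in for the real unit of work; a die() here is caught by the worker.
sub process_record {
    my ($record) = @_;
    return uc $record;
}

# Every thread in the pool runs this same loop.  A runtime error never
# kills the thread: it is absorbed by eval{} and merely noted in the
# response-record for someone downstream to deal with.
sub worker {
    while ( defined( my $job = $requests->dequeue() ) ) {
        my %response = ( job => $job );
        eval {
            $response{result} = process_record($job);
            1;
        } or do {
            $response{error} = $@ || 'unknown error';
        };
        $responses->enqueue( \%response );
    }
}

# A small, configurable pool; nobody cares which thread picks up a request.
my $POOL_SIZE = 4;
my @pool = map { threads->create( \&worker ) } 1 .. $POOL_SIZE;

# Later, once the producer has called $requests->end(), each dequeue()
# returns undef and the workers can be joined:
#   $_->join() for @pool;
```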
To avoid too much competition for the “single file,” you might dedicate one thread to the task of reading a block of records from the file and shoving them into the request queue. By some appropriate means, let that thread snooze until the number of enqueued items drops below some threshold, at which time it reads a few more records from the file to recharge the queue.
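
Along the same lines, here is one way such a feeder thread might look, again using core Thread::Queue. The water-mark and block-size numbers are arbitrary, and the end() call assumes a Thread::Queue recent enough to provide it.

```perl
use strict;
use warnings;
use Thread::Queue;

my $LOW_WATER  = 20;    # refill when fewer than this many items remain queued
my $BLOCK_SIZE = 100;   # records read from the file per refill

# One dedicated thread owns the file handle, so the workers never compete
# for the single input file.  It tops the request queue up in blocks and
# snoozes whenever enough work is already pending.
sub feeder {
    my ( $path, $queue ) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!";
    until ( eof $fh ) {
        if ( $queue->pending() < $LOW_WATER ) {
            my @block;
            while ( @block < $BLOCK_SIZE ) {
                my $line = <$fh>;
                last unless defined $line;
                chomp $line;
                push @block, $line;
            }
            $queue->enqueue(@block) if @block;
        }
        else {
            sleep 1;    # snooze until the workers drain the queue a bit
        }
    }
    close $fh;
    $queue->end();      # no more work: dequeue() now returns undef to each worker
}

# Usage, assuming the $requests queue and worker pool from the sketch above:
#   my $feeder_thread = threads->create( \&feeder, 'records.txt', $requests );
```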
In this way, the jobs are indeed “processed in parallel,” but you maintain control over the attempted multiprogramming level at all times. Such a system could perform work at a predictable and steady rate no matter how many jobs ultimately needed to be run. The size of the file would not affect the rate at which the work was carried out, only the total wall-time required to do it.
Replies are listed 'Best First'.
- Re^2: fixed set of forked processes by BrowserUk (Patriarch) on Dec 02, 2010 at 20:13 UTC
- Re^2: fixed set of forked processes by anonymized user 468275 (Curate) on Dec 02, 2010 at 18:33 UTC