in reply to Re: Perl and autoparallelization
in thread Perl and autoparallelization

Splitting the files is an attractive and simple solution provided the CPU requirements for processing each file are comparable, or there are enough of them for the differences to average out. But suppose that assumption isn't true? How would you use Perl to manage a job queue that sent the next file for processing to the next available processor?

Replies are listed 'Best First'.
Re^3: Perl and autoparallelization
by BrowserUk (Patriarch) on Jun 07, 2004 at 18:26 UTC

    I'd use one worker thread per processor and a Thread::Queue of the files to be processed. The main thread sets up (or feeds incrementally, if the list is very large, e.g. >~10,000 items) the Q with the files to be processed.

    Each thread takes the next file off the Q, processes it, then loops back and gets the next until the Q is empty.

    This is extremely simple to code and, since 5.8.3, appears to be very stable as far as memory consumption is concerned, though I haven't done any really long runs using Thread::Queue.

    Once the threads are spawned, no new threads or processes need to be created or destroyed, which makes it pretty efficient. All the sharing and locking required is taken care of by the tested and proven Thread::Queue module.

    I might try varying the number of threads up and down to see what gave the optimal throughput.
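    The approach above can be sketched roughly as follows. This is only an illustration, not BrowserUk's actual code: `process_file()`, the worker count, and the file list are all hypothetical placeholders. The one detail worth noting is the classic shutdown idiom: enqueue one `undef` sentinel per worker so each thread's blocking `dequeue` eventually returns `undef` and the loop exits.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $NUM_WORKERS = 4;    # assumption: one worker per processor; tune up/down for throughput

    my $work_q   = Thread::Queue->new;
    my $result_q = Thread::Queue->new;    # collect results back in the main thread

    # Each worker pulls the next file off the shared queue until it sees
    # the undef sentinel. A slow file only ties up one worker while the
    # others keep draining the queue, so load balances itself.
    sub worker {
        while ( defined( my $file = $work_q->dequeue ) ) {
            $result_q->enqueue( process_file($file) );
        }
    }

    # Hypothetical stand-in for the real per-file processing.
    sub process_file {
        my ($file) = @_;
        return "done:$file";
    }

    my @workers = map { threads->create( \&worker ) } 1 .. $NUM_WORKERS;

    my @files = map { "file$_.dat" } 1 .. 20;    # hypothetical file list
    $work_q->enqueue(@files);
    $work_q->enqueue( (undef) x $NUM_WORKERS );  # one sentinel per worker

    $_->join for @workers;

    my @results;
    push @results, $result_q->dequeue while $result_q->pending;
    print scalar(@results), " files processed\n";
    ```

    Feeding the queue after spawning the workers (rather than before) works equally well here, since `dequeue` blocks until an item arrives.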


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
Re^3: Perl and autoparallelization
by paulbort (Hermit) on Jun 07, 2004 at 18:24 UTC
    Another choice might be POE. I haven't used it, but it seems like it would be a good fit for this kind of load balancing, especially if there's a chance the workload will later exceed the capacity of one machine.

    --
    Spring: Forces, Coiled Again!