Depending on the amount of error checking you need, runN might already be enough to run multiple processes in parallel. Alternatively, have a look at Parallel::ForkManager.
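If Parallel::ForkManager fits your needs, a minimal sketch might look like the following; the worker limit of 4, the glob pattern, and the process_one.pl helper are assumptions, not part of the original suggestion:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Parallel::ForkManager;

    # Cap the number of simultaneous child processes (assumed limit of 4).
    my $pm = Parallel::ForkManager->new(4);

    for my $file (glob 'input/*.txt') {      # assumed input location
        $pm->start and next;                 # parent: fork and move on
        # --- child process ---
        # Do the per-file work here, e.g. run a hypothetical worker script:
        system('perl', 'process_one.pl', $file);
        $pm->finish;                         # child exits
    }
    $pm->wait_all_children;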
If you are on Windows, using fork() gets a bit hairier. There I would use threads together with system(), so that the processing happens mostly in the children spawned by system(). For managing and limiting those children, look at Re^3: Multi-threads newbie questions, which should give you a good skeleton for having a set of worker threads work through a (larger) set of tasks.
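A rough sketch of that worker-thread pattern follows; the worker count and the process_one.pl command are placeholders:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    # Queue up the work items, then mark the queue as complete.
    my $queue = Thread::Queue->new(glob 'input/*.txt');
    $queue->end;

    my $workers = 4;                         # assumed limit on concurrency
    my @threads = map {
        threads->create(sub {
            while (defined(my $file = $queue->dequeue)) {
                # Let the child spawned by system() do the heavy lifting.
                system('perl', 'process_one.pl', $file);
            }
        });
    } 1 .. $workers;

    $_->join for @threads;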
I agree with Corion, although if you are running on Windows, I strongly suggest avoiding Parallel::ForkManager and using threads instead, together with Thread::Queue to manage your inputs and outputs.
The way I have tackled this type of scenario is to have one queue holding all the input files, a set of processing threads that pick from that queue, process the files and "print" their output to a second queue, and a final thread that picks from the output queue and writes directly to the output file without doing any processing of its own. It's a simple model that re-uses existing threads, reducing the threading overhead in your app.
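A sketch of that two-queue model, assuming placeholder file names and a trivial per-file task (counting lines):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $in  = Thread::Queue->new(glob 'input/*.txt');   # input queue
    my $out = Thread::Queue->new;                       # output queue
    $in->end;

    my @workers = map {
        threads->create(sub {
            while (defined(my $file = $in->dequeue)) {
                open my $fh, '<', $file or next;
                my $lines = 0;
                $lines++ while <$fh>;
                $out->enqueue("$file: $lines lines");
            }
        });
    } 1 .. 4;                                           # assumed 4 workers

    # The writer thread is the only one that touches the output file.
    my $writer = threads->create(sub {
        open my $fh, '>', 'results.txt' or die "results.txt: $!";
        while (defined(my $line = $out->dequeue)) {
            print {$fh} $line, "\n";
        }
        close $fh;
    });

    $_->join for @workers;
    $out->end;          # nothing more will arrive; lets the writer finish
    $writer->join;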
Thanks. I have used threads and it's working fine.
... and if you are on Unix/Linux, you can do something with:
ls -1 dirname | xargs -n 1 -P nprocs perl process_one.pl
... in order to run one Perl program per file (process_one.pl above is a placeholder) that, say, writes its output to a named pipe (mkfifo). (xargs takes care of all the parallelism.) Meanwhile, another process cats whatever arrives on that pipe to the specified target file. Alternatively, if you have good file locking, the Perl program briefly locks the target file before writing to it.
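For the locking variant, a minimal Perl sketch; the target file name and the result string are assumptions:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Fcntl qw(:flock);

    # Append this worker's result to a shared file, holding an exclusive
    # lock only for the duration of the write.
    my $result = "output for $ARGV[0]\n";

    open my $fh, '>>', 'combined.out' or die "combined.out: $!";
    flock $fh, LOCK_EX               or die "cannot lock combined.out: $!";
    print {$fh} $result;
    close $fh;                       # closing the handle releases the lock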
In this way, “the Unix way,” you use the existing facilities of the Shell to do complicated things using simple individual programs that each do only one thing.
“The Unix way” is not limited to shell facilities. The best tool for many automated tasks (and many a problem posed on PM) is make.
Makefile rules can be written so that the intermediate files are kept; the independent results can then be concatenated together as the final step. GNU make has the -j option to specify the number of parallel jobs.
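A minimal GNU make sketch of that approach; the file layout and the process_one.pl helper are assumptions (recipe lines must be indented with a literal tab):

    # Each input/*.txt is processed independently; the .out files are
    # kept and concatenated as the final step.  Run with:  make -j 4
    INPUTS  := $(wildcard input/*.txt)
    OUTPUTS := $(INPUTS:.txt=.out)

    all: combined.out

    combined.out: $(OUTPUTS)
    	cat $^ > $@

    %.out: %.txt
    	perl process_one.pl $< > $@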