kulls has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have 10,000 files. I want to pass these files to a Perl program (one by one) for computation. The results are independent of each other and are written to a common file. Can anyone suggest any POE samples for running programs in parallel that fit this requirement? Or should I go for threads instead? If so, which is better?

Thanks,
Raja K

Replies are listed 'Best First'.
Re: POE Examples
by Corion (Patriarch) on Jul 14, 2014 at 11:27 UTC

    Depending on the amount of error checking you need, runN might already be enough to run multiple processes in parallel. Alternatively, have a look at Parallel::ForkManager.
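
    A minimal sketch of the Parallel::ForkManager route, assuming the module is installed from CPAN. The names process_file, run_all and results.txt are hypothetical, and the line count only stands in for the real computation. Each child hands its result back to the parent, so only the parent writes the common file:

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# Placeholder for the real per-file computation: here, a line count.
sub process_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $lines = 0;
    $lines++ while <$in>;
    close $in;
    return "$path\t$lines";
}

sub run_all {
    my ($out_path, $max_procs, @files) = @_;
    my @results;

    my $pm = Parallel::ForkManager->new($max_procs);

    # Runs in the parent each time a child exits; $data is the reference
    # that the child passed to finish().
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident, $signal, $core_dump, $data) = @_;
        push @results, $$data if defined $data;
    });

    for my $file (@files) {
        $pm->start and next;         # parent: go on to the next file
        my $result = process_file($file);
        $pm->finish(0, \$result);    # child: hand the result back and exit
    }
    $pm->wait_all_children;

    # The file is opened only after all children are gone, so no child
    # inherits a half-filled output buffer.
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    print {$out} "$_\n" for @results;
    close $out or die "Cannot close $out_path: $!";
}

# Hypothetical invocation: perl script.pl file1 file2 ...
run_all('results.txt', 4, @ARGV) if @ARGV;
```

    Results arrive in the order the children finish, not in input order, which should not matter when they are independent.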

    If you are on Windows, using fork() gets a bit hairier. I would use threads together with system() to do the processing mostly in the children as spawned by system(). For managing / limiting the children, look at Re^3: Multi-threads newbie questions, which should be a good skeleton on how to use a set of worker threads working on a (larger) set of tasks.

      I agree with Corion, although if you are running on Windows, I strongly suggest avoiding Parallel::ForkManager and instead using threads, along with Thread::Queue to manage your inputs and outputs.

      The way I have tackled this type of scenario is to have one queue with all the input files listed and a set of processing threads that pick from that queue, process the files and "print" the output to another queue. A final thread is responsible for picking from the output queue and writing directly to file without performing any processing. It's a simple model that re-uses existing threads, reducing the overhead of threading your app.
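
      The model described above might be sketched as follows, assuming a perl built with thread support. The names process_file, run_pool and results.txt are hypothetical, and the line count is only a stand-in for the real work. An undef on a queue is used here as an "end of input" marker:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Placeholder for the real per-file computation: here, a line count.
sub process_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $lines = 0;
    $lines++ while <$in>;
    close $in;
    return "$path\t$lines";
}

sub run_pool {
    my ($out_path, $n_workers, @files) = @_;
    my $in_q  = Thread::Queue->new(@files);   # all input files, queued up front
    my $out_q = Thread::Queue->new();         # finished result lines

    # Processing threads: take a file name, compute, queue the result.
    my @workers;
    for (1 .. $n_workers) {
        push @workers, threads->create(sub {
            while (defined(my $file = $in_q->dequeue())) {
                $out_q->enqueue(process_file($file));
            }
        });
    }

    # Writer thread: the only code that touches the common file.
    my $writer = threads->create(sub {
        open my $out, '>', $out_path or die "Cannot write $out_path: $!";
        while (defined(my $line = $out_q->dequeue())) {
            print {$out} "$line\n";
        }
        close $out or die "Cannot close $out_path: $!";
    });

    # One undef per worker tells it that the input is exhausted.
    $in_q->enqueue(undef) for @workers;
    $_->join() for @workers;

    # Every result is queued by now, so a single undef stops the writer.
    $out_q->enqueue(undef);
    $writer->join();
}

# Hypothetical invocation: perl script.pl file1 file2 ...
run_pool('results.txt', 4, @ARGV) if @ARGV;
```

      The worker threads are created once and reused for every file, so the cost of starting a thread is paid a handful of times rather than 10,000 times.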

        Thanks. I have used threads and it's working fine.
Re: POE Examples
by locked_user sundialsvc4 (Abbot) on Jul 14, 2014 at 13:12 UTC

    ... and if you are on Unix/Linux, you can do something with:
    ls -1 dirname | xargs -Pnprocs

    ... in order to run one Perl program that, say, writes its output to a named pipe (mkfifo). (xargs takes care of all the parallelism.) Meanwhile, another process cats the input from that pipe (from whatever source) to the specified target file. Alternatively, if you have good file-locking, the Perl program briefly locks the target file before writing to it.
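
    The file-locking alternative could look roughly like this in the Perl program that xargs starts. It is a sketch that assumes flock works on the target filesystem (it may not on some network mounts); process_file, append_result and results.txt are hypothetical names, and the line count stands in for the real computation:

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# Placeholder for the real per-file computation: here, a line count.
sub process_file {
    my ($path) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $lines = 0;
    $lines++ while <$in>;
    close $in;
    return "$path\t$lines";
}

# Append one result line to the shared file while holding an exclusive lock.
sub append_result {
    my ($out_path, $line) = @_;
    open my $out, '>>', $out_path or die "Cannot open $out_path: $!";
    flock($out, LOCK_EX)          or die "Cannot lock $out_path: $!";
    # Another process may have appended while we waited for the lock.
    seek($out, 0, SEEK_END)       or die "Cannot seek in $out_path: $!";
    print {$out} "$line\n";
    # close flushes the buffer and then releases the lock.
    close $out or die "Cannot close $out_path: $!";
}

# Each invocation handles the file names it was given on the command line.
append_result('results.txt', process_file($_)) for @ARGV;
```

    The lock is held only for the duration of one short write, so the parallel processes spend almost all of their time computing rather than waiting on each other.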

    In this way, “the Unix way,” you use the existing facilities of the Shell to do complicated things using simple individual programs that each do only one thing.

Re: POE Examples
by Anonymous Monk on Jul 14, 2014 at 21:33 UTC

    “The Unix way” is not limited to shell facilities. The best tool for many automated tasks (and many a problem posed on PM) is make.

    Makefile rules may instruct intermediary files to be kept; in this case, the independent results could be catenated together as the final step. GNU make has the -j option to specify the number of parallel jobs.