monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

Currently I have the script below, which processes multiple files iteratively, in a "serial" manner.
Is there a way to modify the code so that I can run it in multiple instances, in a "parallel" manner? Assume the parameters required for each process are the same.
#!/usr/bin/perl -w
use strict;

my @files = ("file1.fasta", "file2.fasta", "file3.fasta"); # etc., there are 50 of these

for (@files) {
    run_code($_);
}

sub run_code {
    my $file = shift;
    # some algorithm-specific parameters
    # process my file here
}
Regards,
Edward

Replies are listed 'Best First'.
Re: How to Run Script for Multiple Files in Parallel?
by thor (Priest) on Apr 18, 2005 at 04:51 UTC
    I like Parallel::ForkManager. I doubt that you'd want to run all of them at once, but rather something like a maximum of 5 at a time until the queue is done. Parallel::ForkManager handles this for you.
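    A minimal sketch of how that might look, assuming the OP's run_code sub and file list (the cap of 5 concurrent children is an arbitrary choice):

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my @files = ("file1.fasta", "file2.fasta", "file3.fasta"); # ... and so on

    my $pm = Parallel::ForkManager->new(5); # at most 5 children at a time

    for my $file (@files) {
        $pm->start and next; # parent: child forked, move on to the next file
        run_code($file);     # child: do the actual work
        $pm->finish;         # child: exit when done
    }
    $pm->wait_all_children;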

    thor

    Feel the white light, the light within
    Be your own disciple, fan the sparks of will
    For all of us waiting, your kingdom will come

Re: How to Run Script for Multiple Files in Parallel?
by sk (Curate) on Apr 18, 2005 at 05:56 UTC
    This idea is definitely not as elegant as thor's, but if you are looking for a quick way to do it, why not let the OS take care of the multiple instances? It is easy to write a shell script (a couple of lines) that calls your Perl script with the required file arguments several times. If you wanted to run two different routines within the same script this idea would not work, but if you just want multiple files handled, the shell might give you a quick, albeit less elegant, solution:

    script file1 file2 file3 &
    script file4 file5 file6 &
    ...

    This also assumes that you are not creating any *static* temp file names (i.e. hard coded) within your script!

    cheers

    SK

Re: How to Run Script for Multiple Files in Parallel?
by salva (Canon) on Apr 18, 2005 at 08:31 UTC
    You can use the Proc::Queue module, which also limits the number of processes running simultaneously.
    use Proc::Queue size => 5, qw(run_back);

    for (@ARGV) {
        run_back { run_code($_) };
    }
    1 while wait != -1;
      Thanks Salva,

      I tried your module. It's easy to use indeed.
      BTW, how can we verify that they are actually running in parallel? I tried "top" but it only shows one instance.

      Regards,
      Edward
        To see what is happening you can use Proc::Queue trace mode, just add to your script...
        use Proc::Queue qw(run_back);
        Proc::Queue::trace(1);
        Also, wrap the forked code inside an eval block and report errors:
        run_back {
            eval {
                ... parallel code here ...
            };
            print STDERR $0 . "[$$]: $@\n" if $@;
        };
Re: How to Run Script for Multiple Files in Parallel?
by tweetiepooh (Hermit) on Apr 18, 2005 at 08:43 UTC
    I've used a technique like the following. Mind you, I only have 3-5 copies running, each taking between 1 and 3 hours, under cron.

    #!perl
    if (<some parameter not set>) {
        <set up environment etc.>
        foreach (@somesortoflist) {
            system(<this script> <with parameter> <&>);
        }
        exit;
    }
    <remainder of script>
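    A concrete, purely hypothetical rendering of that pattern, where an invented environment variable SPAWNED plays the role of the "parameter not set" guard:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # First pass: guard not set, so relaunch ourselves in the background,
    # once per file, and exit.
    unless ($ENV{SPAWNED}) {          # SPAWNED is a made-up guard variable
        $ENV{SPAWNED} = 1;            # children inherit this and skip the block
        system("$0 $_ &") for @ARGV;  # "&" puts each copy in the shell's background
        exit;
    }

    # Second pass (the background copies): process the single file given.
    my ($file) = @ARGV;
    # ... remainder of script: process $file here ...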
Re: How to Run Script for Multiple Files in Parallel?
by Forsaken (Friar) on Apr 18, 2005 at 12:34 UTC
    Not entirely sure on *why* you would want to, but my personal preference would be to use threads for something like this. 1 process, multiple worker threads and perhaps 1 factory thread to control the whole thing.
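    A rough sketch of that worker model using the threads and Thread::Queue modules (the file list, the worker count of 5, and the run_code sub are all assumptions carried over from the original post):

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my @files     = ("file1.fasta", "file2.fasta", "file3.fasta"); # ... etc.
    my $n_workers = 5;

    my $queue = Thread::Queue->new;
    $queue->enqueue(@files);
    $queue->enqueue((undef) x $n_workers); # one undef sentinel stops each worker

    my @workers = map {
        threads->create(sub {
            while (defined(my $file = $queue->dequeue)) {
                run_code($file); # the OP's processing sub
            }
        });
    } 1 .. $n_workers;

    $_->join for @workers;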

    Remember rule one...
      What is rule one (apart from "holli is always right")?


      holli, /regexed monk/
        Something along the lines of "be very careful of small wrinkly bald monks".

        Terry Pratchett, Thief of Time.

        Remember rule one...