vit has asked for the wisdom of the Perl Monks concerning the following question:


Replies are listed 'Best First'.
Re: Running parallel processes without communication
by Corion (Patriarch) on Jan 04, 2011 at 08:25 UTC

    I often use Dominus' runN, which conveniently launches n instances of a program with multiple arguments.

Re: Running parallel processes without communication
by ww (Archbishop) on Jan 03, 2011 at 22:25 UTC
    C'mon vit, this is NOT your first day on the job at the Monastery. In fact, it's a bit past your thousandth.

    What have you tried? Where's your code? What have you searched for?

      Alright :)
      Would this be correct? It looks like it works:
      use strict;
      use warnings;

      my $pid = fork();
      if ($pid == 0) {                 # first child
          `perl write.pl f1.txt`;
          exit(0);
      }
      $pid = fork();                   # reuse $pid; a second "my" would mask the first
      if ($pid == 0) {                 # second child
          `perl write.pl f2.txt`;
          exit(0);
      }
      exit(0);

      ### where write.pl is:
      use strict;
      my $file = $ARGV[0];
      open my $of, '>', $file or die "cannot open file $file\n";
      for (1..10) { print $of time() . "\n"; sleep 1 }
      exit(0);
      The timestamps printed in the two files overlap.
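
      For more than two files, the same pattern generalizes to a loop, and the parent can wait for its children instead of exiting straight away. A minimal sketch, reusing the write.pl helper above (the file list is just a placeholder):

      use strict;
      use warnings;

      my @files = ('f1.txt', 'f2.txt', 'f3.txt');   # placeholder list

      my @pids;
      for my $file (@files) {
          my $pid = fork();
          die "fork failed: $!" unless defined $pid;
          if ($pid == 0) {                  # child: run the worker, then exit
              system('perl', 'write.pl', $file);
              exit(0);
          }
          push @pids, $pid;                 # parent: remember the child
      }
      waitpid($_, 0) for @pids;             # block until every child is done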
Re: Running parallel processes without communication
by Anonyrnous Monk (Hermit) on Jan 03, 2011 at 22:55 UTC

    One way (not necessarily the best; what counts as best depends on how you define it):

    #!/bin/bash
    for i in {1..10}
    do
        perl perl_code infile_$i outfile_$i &
    done

    If you need more flexibility with what's being iterated over, you can of course also use Perl and start the subprocesses with system "perl ... &".
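
    A minimal Perl sketch of that approach (the file-name pattern is just illustrative); the trailing & hands each job to the shell to run in the background, so the loop does not wait:

    use strict;
    use warnings;

    for my $i (1 .. 10) {
        # the shell's & launches the job in the background
        system(qq{perl perl_code infile_$i outfile_$i &});
    }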

      Maybe the xargs command (from the shell), with the -n and -P flags used wisely, is cleaner and less prone to turning into a self-inflicted fork bomb.
Re: Running parallel processes without communication
by chrestomanci (Priest) on Jan 04, 2011 at 10:24 UTC

    As others have said, the simple solution is to use fork() to create a bunch of sub-processes, one per file. The problem with that approach is that if you have thousands of files, the simple solution would create thousands of sub-processes, which will bring your computer to a crawl.

    If your requirements are simple, and you are running under unix/linux, then you could use the -P argument to xargs.

    e.g., your processing script takes one input file and can work out the output file name for itself:

    cd ~/directory/with/files/to/process
    ls -1 | xargs -n1 -P 10 perl -w process_script.pl -args

    Corion suggested the use of Dominus's runN script, which does broadly the same thing as xargs.

    If I were in your situation and my requirements were too complex for xargs, then I would write a script around Parallel::ForkManager.

    This makes it possible to have more complex logic to decide what the output file should be for each input file, or to process some files in different ways. You also get a nice callback mechanism to handle errors and the like, all while the number of concurrent worker processes is limited to a number you specify.

    Of course, you could achieve the same by writing your own code using fork and signals, but why bother when there is already something available and debugged on CPAN?
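
    A minimal sketch of that approach, assuming the input files arrive in @ARGV and the output name is derived from the input name (both assumptions, for illustration only):

    use strict;
    use warnings;
    use Parallel::ForkManager;

    my $pm = Parallel::ForkManager->new(10);    # at most 10 concurrent workers

    # optional callback, run in the parent as each child finishes
    $pm->run_on_finish(sub {
        my ($pid, $exit_code, $ident) = @_;
        warn "job $ident (pid $pid) exited with $exit_code\n" if $exit_code;
    });

    for my $infile (@ARGV) {
        $pm->start($infile) and next;           # parent: move on to the next file
        (my $outfile = $infile) =~ s/\.in\z/.out/;  # child: derive output name
        system('perl', 'process_script.pl', $infile, $outfile);
        $pm->finish($? >> 8);                   # child exits, reporting status
    }
    $pm->wait_all_children;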

Re: Running parallel processes without communication
by ambrus (Abbot) on Jan 04, 2011 at 18:16 UTC

    Don't forget parallel make (make -j4).

Re: Running parallel processes without communication
by Anonymous Monk on Jan 09, 2011 at 20:50 UTC

    GNU Parallel http://www.gnu.org/software/parallel/ is made for exactly this purpose. If the output file name is derived from the input file name (e.g. foo.jpg -> foo.png), you can do this:

    cat list | parallel perl perl_code {} {.}.png

    If you have a tab separated table of input/output files:

    cat list.tsv | parallel --colsep '\t' perl perl_code {1} {2}

    If perl_code is a one-liner given on the command line, you may want to read the QUOTING section of the man page.

    Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ