Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

Once again I call on your help because I obviously don't really understand forking yet, although I have been reading up :-/

In the script below, there is something I want to do after sending off the children within the loop. But for some reason, everything after $fork_manager->finish never happens, and I don't understand why. I thought maybe the problem was the "... and next" at the start of the fork, because I've had problems with that in subs before. However, at the time, I enclosed the block in an extra {...}, and it worked. That doesn't help here.

use strict;
use warnings;
use diagnostics -verbose;
use Parallel::ForkManager;

my @array = ("") x 56;
my $fork_manager = new Parallel::ForkManager(16);
my $check = 0;
for my $val (@array) {
    $fork_manager->start and next;
    $check++;
    $fork_manager->finish; # Terminates the child process
    print "Hello?\n";
    # $fork_manager->start and next;
    # $check++;
    # $fork_manager->finish;
}
print "End: " . $check . "\n";

In addition, what I want to do relies on the processes within that iteration of the loop being done. I thought of using $fork_manager->wait_all_children, but doesn't that mean that ALL processes must be done, e.g. I could use it only outside the loop?

And finally, there is one thing I am not sure about in Parallel::ForkManager's documentation: what if, after the first batch of processes in the loop, I want to start some more? Can I use the same object, or should I create a new one? The documentation does contain a note about managing a set of subprocesses within the first one, but that's different from what I want to do here (I think).

I would gladly be enlightened :-)

Replies are listed 'Best First'.
Re: Problem with [mod://Fork::Manager]
by RichardK (Parson) on May 30, 2012 at 08:50 UTC

    The help for finish in Parallel::ForkManager says

    Closes the child process by exiting

    so therefore the child can't do any more work after the exit.

    If you want to do work in the parent after starting a child then you'll need to do something like

    for my $item (@list) {
        if ($pm->start) {
            # do parent stuff
            next;
        }
        # child stuff goes here
        $pm->finish;
    }

    BUT that seems a strange way to use Parallel::ForkManager, so maybe you need to rethink your design.

    If you explain what you're really trying to do, perhaps someone here can make some useful suggestions.
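To illustrate the point about finish, here is a minimal sketch, assuming Parallel::ForkManager is installed. The names @items and $done are made up for illustration. Each child is a separate process, so a counter incremented in the child (like $check in the original script) never reaches the parent. In this sketch the parent does the counting in a run_on_finish callback, and the final print comes after wait_all_children.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my @items = (1 .. 8);                        # hypothetical work items
my $pm    = Parallel::ForkManager->new(4);   # at most 4 children at a time
my $done  = 0;

# This callback runs in the parent each time a child has been reaped.
$pm->run_on_finish(sub {
    my ($pid, $exit_code) = @_;
    $done++ if $exit_code == 0;
});

for my $item (@items) {
    my $pid = $pm->start;   # child PID in the parent, 0 in the child
    if ($pid) {
        # Parent-only work for this iteration goes here.
        next;
    }
    # Child-only work goes here.
    $pm->finish(0);         # the child exits here; nothing below runs in it
}

$pm->wait_all_children;     # block until every child has been reaped
print "End: $done\n";
```

Anything placed between finish and the closing brace of the loop is never reached: the parent has already jumped to the next iteration, and the child has exited.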

      Thanks, I didn't get the crucial meaning about exiting in the documentation. Learning to understand the documentation is also learning - I remember how I struggled with man pages years ago :-/

      Here's what I want to do in pseudo-kind-of-code, apart from a bunch of extra loops that shouldn't have any influence.

      my @array1 = (1..5); # N
      my @array2 = qw/a b/;
      for my $val1 (@array1) {
          # Start N processes here that can run in parallel
          # Each process outputs data to its own separate file
          # I will need it in the future
          for my $val2 (@array2) {
              # For each of the N processes, wait until it is done,
              # then start 2 parallel processes which use
              # the output data as input
              # Save the output of each process separately to 2N files
              # (two files with N elements would be better, but as I
              # couldn't figure that out, I just postprocess the data ;-)
          }
      }
      # "waitallchildren" or equivalent
      # Postprocess step to reduce the final output to 2 files

      I just tried this, but it doesn't work. The outer part does, but as soon as I uncomment the inner part, my prompt doesn't return anymore. No idea what's going on.

      use Proc::Fork;
      for my $val1 (@array1) {
          run_fork {
              child {
                  open FILE, ">$val1.txt";
                  print FILE "Output of step 1\n";
                  close FILE;
              }
              parent {
                  my $child_pid_outer = shift;
                  waitpid $child_pid_outer, 0;
                  # for my $val2 (@array2) {
                  #     run_fork {
                  #         child {
                  #             open FILE1, "$val1.txt";
                  #             open FILE2, ">$val2$val1.txt";
                  #             while (my $line = <FILE1>) {
                  #                 $line =~ s/1/2/;
                  #                 print FILE2 $line . "\n";
                  #             }
                  #             close FILE1;
                  #             close FILE2;
                  #         }
                  #         parent {
                  #             my $child_pid_inner = shift;
                  #             waitpid $child_pid_inner, 0;
                  #         } # Parent inner loop
                  #     }; # Inner fork
                  # } # For loop
              } # Parent outer fork
          }; # Outer fork
      }

      I did have a (short) look at other packages, but I must admit I didn't understand much, unfortunately.
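One possible shape for the two-step plan above, as a sketch only and assuming Parallel::ForkManager is installed. It simplifies the plan in one respect: all of step 1 finishes (wait_all_children) before any of step 2 starts, instead of step 2 beginning as soon as its own input file is ready. It also reuses the same manager object for the second batch, which as far as I can tell is fine once the first batch has been waited for. The scratch directory and the file-name scheme are assumptions made for illustration.

```perl
use strict;
use warnings;
use Parallel::ForkManager;
use File::Temp qw(tempdir);

my @array1 = (1 .. 5);
my @array2 = qw(a b);
my $dir    = tempdir(CLEANUP => 1);          # hypothetical scratch directory
my $pm     = Parallel::ForkManager->new(4);

# Step 1: one child per element, each writing its own file.
for my $val1 (@array1) {
    $pm->start and next;
    open my $out, '>', "$dir/$val1.txt" or die "open: $!";
    print {$out} "Output of step 1\n";
    close $out or die "close: $!";
    $pm->finish;
}
$pm->wait_all_children;   # every step-1 file exists past this point

# Step 2: same manager object, two children per step-1 file.
for my $val1 (@array1) {
    for my $val2 (@array2) {
        $pm->start and next;
        open my $in,  '<', "$dir/$val1.txt"      or die "open: $!";
        open my $out, '>', "$dir/$val2$val1.txt" or die "open: $!";
        while (my $line = <$in>) {
            $line =~ s/1/2/;
            print {$out} $line;   # $line still carries its newline
        }
        close $in;
        close $out or die "close: $!";
        $pm->finish;
    }
}
$pm->wait_all_children;

# Postprocessing would go here; this sketch only checks that the files exist.
my @expected = map { my $v1 = $_; map { "$dir/$_$v1.txt" } @array2 } @array1;
my @missing  = grep { !-e $_ } @expected;
print @missing ? "Missing: @missing\n"
               : "All " . @expected . " step-2 files written\n";
```

Using lexical filehandles with three-argument open and checking the return values makes a failed open visible immediately, which helps when children fail silently.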

        I just wanted to mention that I reposted this under a more appropriate title, since I don't care about the module as long as it works: Help with multiple forks. I got several answers, which I am now going through.

Re: Problem with [mod://Fork::Manager]
by locked_user sundialsvc4 (Abbot) on May 30, 2012 at 14:40 UTC

    They say that “there’s more than one way to do it,” but I think that in the case of multiprogramming there is definitely a right way and a not-right way. Instead of launching processes and then having one parent process looking over their shoulders, launch one or more pools of processes and make each one the master of its own affairs.

    For example, you could have a pool of x stage-one processes and y stage-two processes, such that both x and y are tunable parameters. A stage-one process pulls a unit of work off of some queue, performs it, and then writes a completion message to another queue. Stage-two processes are listening to that queue. And so on, right down the line.

    What you have now built is a production line. Each worker at each station does one task and does it independently. Buffered queues hold a variable amount of work-in-process. Having launched all of the processes, the parent process might have absolutely nothing further to do except to watch for termination signals coming in from the children: the parent does not have to “do the right thing at the right time” to keep the whole machine running smoothly. It is unpredictable from one moment to the next what task(s) will be underway and at what point of completion, but eventually all of it will come to an orderly point of conclusion.

    It should of course go without saying that there are numerous complete frameworks within CPAN for implementing scenarios such as this one. You don’t have to write much; you merely have to select what works best for you and for this project. “Actum Ne Agas: Do Not Do A Thing Already Done.”
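The production-line idea can be sketched with nothing but core Perl (fork and pipe). This is an illustration under simplifying assumptions, not one of the CPAN frameworks mentioned above: there is one worker per stage rather than a tunable pool, the pipe buffers stand in for the queues, and the helper spawn_stage and the two toy transformations are hypothetical.

```perl
use strict;
use warnings;
use IO::Handle;

# Fork one worker that reads lines from $in, applies $code to each,
# and writes the result to $out. Returns the child's PID to the parent.
sub spawn_stage {
    my ($in, $out, $code, @close_in_child) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;               # parent returns immediately

    # Child: drop every inherited pipe end it does not use, otherwise
    # some reader down the line would never see end-of-file.
    close $_ for @close_in_child;
    $out->autoflush(1);
    while (my $line = <$in>) {
        chomp $line;
        print {$out} $code->($line), "\n";
    }
    close $out;
    exit 0;
}

# Three queues: parent -> stage 1 -> stage 2 -> parent.
pipe(my $q1_r, my $q1_w) or die "pipe: $!";
pipe(my $q2_r, my $q2_w) or die "pipe: $!";
pipe(my $q3_r, my $q3_w) or die "pipe: $!";

my @pids = (
    spawn_stage($q1_r, $q2_w, sub { uc $_[0] },
                $q1_w, $q2_r, $q3_r, $q3_w),
    spawn_stage($q2_r, $q3_w, sub { "done: $_[0]" },
                $q1_r, $q1_w, $q2_w, $q3_r),
);

# The parent keeps only the first write end and the last read end.
close $_ for ($q1_r, $q2_r, $q2_w, $q3_w);

# Feed the line, then signal end-of-input by closing the write end.
# For large inputs the parent would have to read results while feeding;
# writing everything first is only safe while it fits in the pipe buffers.
print {$q1_w} "$_\n" for qw(alpha beta gamma);
close $q1_w;

my @results = <$q3_r>;
chomp @results;
close $q3_r;
waitpid $_, 0 for @pids;

print "$_\n" for @results;
```

Once the workers are started, the parent only feeds the first queue and drains the last one; each stage decides for itself when to work and when to stop, and stops when its input reaches end-of-file.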

      What modules do you recommend for creating buffered queues between processes?

      Which complete frameworks are you referring to?