Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a subroutine which calls fork() to spread out jobs over processes. I fork() a process and dojob($job) for every $job, and wait() for all children to finish. It's working as expected... However, now that I think about it more critically, this is obviously flawed b/c wait() waits for any child process, not necessarily those that were spawned by the subroutine. How should I group only relevant children? I don't think waitpid() is useful here since I don't know the order which the jobs will finish. Should I fork twice, once to create a 'controller' child, which itself forks the 'worker' grandchildren? What's the etiquette here?

Replies are listed 'Best First'.
Re: subroutines which forks?
by Fletch (Bishop) on Sep 09, 2010 at 22:08 UTC

    Simplest answer is, "Don't". Use Parallel::ForkManager (or perhaps POE and POE::Wheel::Run) and let it worry about such niceties for you.

    The cake is a lie.

      Except that does almost the exact equivalent of the OP's code! (Did you bother to look?)

Re: subroutines which forks?
by ikegami (Patriarch) on Sep 09, 2010 at 23:29 UTC

    The parent can't do two things at the same time, so some kind of central child management is required. You can limit the amount of knowledge the child manager has about the child creators by using callbacks.

    my %children;

    sub forker1 {
        my $resource = ...;
        my $pid = ...;
        $children{$pid} = sub {
            print "forker1 child $pid exited with \$?=$?\n";
            ... clean up $resource ...
        };
    }

    sub forker2 {
        my $resource = ...;
        my $pid = ...;
        $children{$pid} = sub {
            print "forker2 child $pid exited with \$?=$?\n";
            ... clean up $resource ...
        };
    }

    ...

    forker1() for 1..3;
    forker2() for 1..3;

    while (keys(%children)) {
        my $pid = wait();
        my $handler = delete($children{$pid});
        $handler->() if $handler;
    }

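A minimal runnable sketch of the callback registry described above. The `spawn` helper, the `@log` array, and the exit codes are hypothetical names and values chosen for illustration; the children do no real work. The handler receives the exit status as an argument so that it does not depend on `$?` still being intact when it runs.

```perl
use strict;
use warnings;
use POSIX ();

my %children;   # pid => callback to run once that child has been reaped
my @log;        # records which handler fired, for inspection afterwards

# Hypothetical helper: fork a child that exits with $code and register
# a handler for it under the given label.
sub spawn {
    my ($label, $code) = @_;
    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        # Child: stand-in for real work. _exit skips END blocks and
        # buffered output inherited from the parent.
        POSIX::_exit($code);
    }
    $children{$pid} = sub {
        my ($status) = @_;
        push @log, sprintf '%s exit=%d', $label, $status >> 8;
    };
    return $pid;
}

spawn('forker1', 0) for 1 .. 3;
spawn('forker2', 2) for 1 .. 3;

while (keys %children) {
    my $pid = wait();
    last if $pid == -1;                 # no children left at all
    my $status  = $?;                   # capture before anything overwrites it
    my $handler = delete $children{$pid};
    $handler->($status) if $handler;    # pids nobody registered are ignored
}

print "$_\n" for sort @log;
```

Note that the loop still reaps children it does not know about; it simply has no handler to call for them, so their exit status is lost to whoever forked them.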
Re: subroutines which forks?
by BrowserUk (Patriarch) on Sep 10, 2010 at 07:47 UTC
    Should I fork twice, once to create a 'controller' child, which itself forks the 'worker' grandchildren?

    fork isn't something I use much, but it strikes me that forking twice so that the first level of fork can wait upon the second just moves the goalposts. You still have the problem of waiting for those first-level kids in the originator.
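For reference, the controller layout from the question could be sketched roughly as below. This is only an illustration: `run_jobs` and the toy `dojob` are hypothetical, and reporting the failure count through the controller's exit code is one assumption among several possible designs. Inside the controller, `wait()` can only see the workers, and the originator is left with a single known pid to `waitpid` on.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for the real work; returns an exit code.
sub dojob { my ($job) = @_; return $job % 2 }

# Run @jobs under a dedicated controller process and return the number
# of workers that failed (capped at 255, the largest exit code).
sub run_jobs {
    my @jobs = @_;
    my $controller = fork() // die "fork failed: $!";
    if ($controller == 0) {
        my $failed = 0;
        for my $job (@jobs) {
            my $pid = fork();
            if (!defined $pid) { $failed++; next; }
            POSIX::_exit(dojob($job)) if $pid == 0;   # worker
        }
        # The workers are the controller's only children, so a plain
        # wait() cannot pick up anything unrelated.
        while (wait() != -1) {
            $failed++ if $? != 0;
        }
        POSIX::_exit($failed > 255 ? 255 : $failed);
    }
    # The originator still has to wait, but only for one known pid.
    waitpid($controller, 0) == $controller or die "lost controller: $!";
    return $? >> 8;
}

my $failures = run_jobs(1 .. 4);
print "failed workers: $failures\n";
```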

    I guess you could use async:

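    use threads;

    for my $job ( @jobs ) {
        my $pid = fork // die 'fork failed';
        if( $pid ) {
            # parent: reap this child from a detached thread
            async{ waitpid $pid, 0; }->detach;
        }
        else {
            dojob( $job );
            exit;    # keep the child from carrying on with the loop
        }
    }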

    Of course, then it might simply be easier to skip the forking completely:

    for my $job ( @jobs ) {
        async \&dojob, $job;
    }
    while( threads->list( 1 ) ) {
        $_->join for threads->list( 0 );
    }

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: subroutines which forks?
by salva (Canon) on Sep 10, 2010 at 07:15 UTC
    I don't think waitpid() is useful here since I don't know the order which the jobs will finish

    Usually you don't need to reap the children in the same order they finish. Something like...

    for my $pid (@pids) {
        my $r = waitpid($pid, 0);
        # do error checking here, and ensure that $r is really $pid
    }
    ... will do.

    The only drawback of this approach is that if you run ps on the machine you will sometimes see zombie processes: children that have already exited but whose turn in the loop has not come yet. They are harmless, but some people get nervous when they see zombies!
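A small self-contained sketch of the per-pid loop with the error checking filled in. The sleep lengths and exit codes are made up for the demonstration: the children deliberately finish in the reverse of launch order, to show that the order of the `waitpid` calls does not need to match.

```perl
use strict;
use warnings;
use POSIX ();

my @pids;
for my $n (1 .. 3) {
    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        # Child $n sleeps longer the earlier it was started, so the
        # children finish in the reverse of launch order.
        select(undef, undef, undef, 0.1 * (4 - $n));
        POSIX::_exit($n);
    }
    push @pids, $pid;
}

my %exit_code;
for my $pid (@pids) {
    my $r = waitpid($pid, 0);    # blocks until this particular child is done
    if ($r != $pid) {
        warn "waitpid($pid) returned $r: $!\n";
        next;
    }
    if ($? & 127) {
        warn "child $pid died from signal ", $? & 127, "\n";
        next;
    }
    $exit_code{$pid} = $? >> 8;
}

print join(' ', map { $exit_code{$_} } @pids), "\n";
```

Children forked elsewhere in the program are never touched, because only pids stored in `@pids` are waited for.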

Re: subroutines which forks?
by JavaFan (Canon) on Sep 10, 2010 at 07:32 UTC
    I don't think waitpid() is useful here since I don't know the order which the jobs will finish.
    So what? If you have to wait for John, Mary, Bob and Ellen, does it really matter that while you're waiting for John, Bob and Mary arrive, waiting their turn in the queue to be ticked off your list?

    Just waitpid for all the pids in some order. If a child has finished before you've waited for it, waitpid returns instantly.

    However, now that I think about it more critically, this is obviously flawed b/c wait() waits for any child process, not necessarily those that were spawned by the subroutine.
    This suggests that elsewhere in the program you're forking children as well, and not waiting for them (or at least, not waiting for them yet) before forking and waiting in the subroutine you're describing.

      So what? If you have to wait for John, Mary, Bob and Ellen, does it really matter that while you're waiting for John, Bob and Mary arrive, waiting their turn in the queue to be ticked off your list?

      It often does. If you do something in response to your children finishing (such as cleaning up resources or starting on the next work unit), those actions would be delayed.

Re: subroutines which forks?
by JavaFan (Canon) on Sep 10, 2010 at 22:23 UTC
    Here's a way to reap your children as soon as they're gone, without reaping any other child processes. It uses the fact that sleep gets interrupted by a signal:
    #!/usr/bin/perl

    use 5.010;
    use strict;
    use warnings;
    use autodie;

    use POSIX ':sys_wait_h';

    $SIG{CHLD} = sub {1;};

    my $start = time;

    my %children;
    foreach (1 .. 10) {
        if (my $pid = fork) {
            $children{$pid} = 1;
        }
        else {
            my $diff = time - $start;
            say "$diff: Child $$";
            sleep rand 10;
            $diff = time - $start;
            say "$diff: Child $$ out of here";
            exit;
        }
    }

    while (1) {
        foreach my $pid (keys %children) {
            if (my $r = waitpid($pid, WNOHANG)) {
                delete $children{$pid};
                my $diff = time - $start;
                say $r == $pid ? "$diff: Reaped $pid" : "$diff: $pid gone!";
            }
        }
        last unless %children;
        my $diff = time - $start;
        say "$diff: Sleeping";
        sleep 1;
    }

    my $diff = time - $start;
    say "$diff: End";

    __END__
    0: Child 3975
    0: Child 3976
    0: Child 3976 out of here
    0: Child 3977
    0: Child 3978
    0: Child 3979
    0: Child 3980
    0: Child 3981
    0: Child 3982
    0: Child 3983
    0: Child 3984
    0: Reaped 3976
    0: Sleeping
    1: Child 3977 out of here
    1: Reaped 3977
    1: Sleeping
    2: Child 3979 out of here
    2: Child 3980 out of here
    2: Reaped 3979
    2: Sleeping
    2: Reaped 3980
    2: Sleeping
    3: Child 3982 out of here
    3: Reaped 3982
    3: Sleeping
    4: Child 3978 out of here
    4: Reaped 3978
    4: Sleeping
    4: Child 3981 out of here
    4: Reaped 3981
    4: Sleeping
    4: Child 3984 out of here
    4: Reaped 3984
    4: Sleeping
    5: Child 3983 out of here
    5: Reaped 3983
    5: Sleeping
    8: Child 3975 out of here
    8: Reaped 3975
    8: End