in reply to How to make parent wait for all the child processes.

Hi,

First, I have some comments on your design. Later on I offer an alternative approach you might find easier to use.

Your children should exit when they are done, otherwise each one of your fork-ed children will begin executing your "main" program code after your initial for loop.

Next, you don't really want to call waitpid in a loop like that, waiting on each child sequentially, since children may exit (or terminate unexpectedly) in any order. If you don't reap children while the parent continues to wait or do further processing, you'll accrue zombie processes that linger in the process table until you wait on them.
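To illustrate the point about exit order, here is a minimal sketch that is separate from the signal-handler approach in this post. It reaps children with a blocking wait(), which returns whichever child finishes first. The helper name run_children and the idea of using each child's index as its exit code are assumptions made only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: fork $count children, then reap them in whatever
# order they happen to finish. Returns a hashref of index => exit code.
sub run_children {
    my ($count) = @_;
    my %pending;

    for my $n (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # CHILD: pause for a short random time, then exit so it never
            # falls through into the parent's code.
            select(undef, undef, undef, rand(0.2));
            exit($n);
        }

        $pending{$pid} = $n;    # PARENT: remember which child this is
    }

    my %status;
    while (%pending) {
        my $pid = wait();       # blocks until *any* child exits
        last if $pid == -1;     # no children left to wait for
        my $n = delete $pending{$pid};
        next unless defined $n; # not one of ours
        $status{$n} = $? >> 8;  # exit code of the child just reaped
    }
    return \%status;
}
```

This only suits a parent that has nothing else to do while it waits. If the parent needs to keep working, a SIGCHLD handler is the better fit.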

I'd encourage you to learn about how to "reap" finished children. The perldoc page perlipc has a section titled Signals that talks about signal handling, and specifically the CHLD signal I'll use below.

I'll briefly walk you through what the code below is doing.

  1. A SIGCHLD handler is defined to call the reap_kids() routine (discussed later.)
  2. A hash called %kids is created, which will contain child PIDs as keys. These can be added/deleted in any order (unlike an array, which we'd have to reference by index.)
  3. Children are launched, similar to your sample code. However, notice this time that the child side makes sure to exit when it is finished.
  4. The main code now prints that it is waiting, and goes into a sleep loop as long as at least 1 child is still around.
  5. Here's where the magic happens. The reap_kids() subroutine gets called when the kernel informs the parent process one of the children has finished running. This may be a normal or unusual termination, and you would have to check the exit status to get more information. (see the $? definition in the perlvar docs for specifics.)
  6. The dispatch_child() routine is what the children "do" in this code. Here they simply say the ID (from the main loop near the top of the code), sleep a random number of seconds, then say goodbye.
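Step 5 mentions checking the exit status. As a rough sketch, the raw value of $? after a wait or waitpid call can be split into its parts as described in perlvar. The helper name decode_status is hypothetical and is not part of the program in this post.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status (the value of $?) into parts.
sub decode_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,             # value the child passed to exit()
        signal    => $status & 127,            # signal that killed it, 0 if none
        core_dump => ($status & 128) ? 1 : 0,  # true if a core dump was reported
    };
}

# Example: a child that called exit(3) yields a raw status of 3 << 8.
my $info = decode_status(3 << 8);
printf "exit=%d signal=%d core=%d\n",
    $info->{exit_code}, $info->{signal}, $info->{core_dump};
```

Inside a reaper, capture $? immediately after the waitpid call that returned the PID, since the next wait overwrites it.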

Something to notice about this code is that I have intentionally made the sleep random, so children exit in a random order. If you run this program multiple times, you'll see that the children do not necessarily end in the same order, and won't usually end sequentially. This is why it's wise to use signal handling to reap any children that have finished.

Here's the code:
    use strict;
    use warnings;
    require POSIX; # provides WNOHANG

    # Set up a signal handler.
    # This prevents child processes from becoming zombies.
    # The subroutine is defined further below.
    $SIG{CHLD} = \&reap_kids;

    # Track spawned child processes.
    # This is a hash, keyed by the PID.
    # We'll use this later to mark them as finished.
    my %kids;

    for my $child_id (1..3) {
        my $pid = fork();

        # PARENT: track kid
        if ($pid) {
            $kids{$pid} = 1;
        }

        # CHILD: dispatch to child-routine below.
        # Explicitly exit, in case the child code neglects.
        else {
            dispatch_child( $child_id );
            exit(0);
        }
    }

    print "Main code now waiting for all children.\n";

    # Wait for all children to finish.
    # The signal handler, reap_kids(), catches finished children.
    # Just continue to sleep if there are pending processes.
    while( scalar(keys %kids) > 0 ) {
        sleep 1;
    }

    # That's all the main code does.
    print "All done with main code.\n";
    exit(0);

    # ---
    # Subroutines
    # ---

    # Child reaper.
    # This will be called when the kernel tells us a process has
    # finished running. It's possible more than 1 has done so.
    # We run this in a loop to reap all children.
    sub reap_kids {
        local $!; # good practice. avoids changing errno.
        while (1) {
            my $kid = waitpid( -1, POSIX->WNOHANG );
            last unless ($kid > 0); # No more to reap.
            delete $kids{$kid}; # untrack kid.
        }
    }

    # Child dispatch code.
    # Here is where you write what your child does.
    # Ideally it should exit, but we enforce this above anyway.
    sub dispatch_child {
        my $id = shift; # passed from caller.

        print "Hello from child number $id\n";

        # Sleep for a random number of seconds.
        # Between 5 and 10.
        my $seconds = 5 + int(rand(6));
        sleep $seconds;

        print "Goodbye from child number $id\n";

        # Be nice and explicitly exit:
        exit(0);
    }

Re^2: How to make parent wait for all the child processes.
by Apero (Scribe) on Dec 02, 2015 at 17:55 UTC

    I was asked in a private message the purpose of localizing $! in the reap_kids() routine, and would just as soon reply to my original post to describe it for the benefit of others.

    The $! variable (defined in the perlvar docs) holds what C programmers know as the errno status. This (possibly) holds information about the last system call failure.

    In Perl code that makes system calls and wishes to track or report on the specific types of errors such calls may return, it is necessary to localize Perl's $! variable. You can read more about how exactly local works by reading about Temporary Values via local() from the perlsub docs.

    In my earlier code example, the SIGCHLD signal handler needs to localize $! in case the main code is busy doing something that (might) use this variable. Without the signal handler keeping its own temporary copy, the value could change underneath the main code and alter its behavior. Generally, any signal handler should localize the global punctuation variables it changes, either explicitly (like $/) or implicitly (like $! discussed here.)
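As a small illustration of the effect, the sketch below compares two routines that both disturb errno, one with local $! and one without. The routine names are hypothetical stand-ins for a handler body, and the errno values are set by hand purely for demonstration.

```perl
use strict;
use warnings;
use Errno qw(ENOENT EACCES);

# Hypothetical stand-ins for a signal handler body that disturbs errno.
sub clobbers_errno  { $! = EACCES; return }            # leaks its change
sub preserves_errno { local $!; $! = EACCES; return }  # restored on return

$! = ENOENT;               # pretend a system call in the main code just failed
preserves_errno();
my $after_local = $! + 0;  # numeric errno seen by the main code afterwards

$! = ENOENT;
clobbers_errno();
my $after_plain = $! + 0;

print "with local:    ", ($after_local == ENOENT ? "unchanged" : "changed"), "\n";
print "without local: ", ($after_plain == ENOENT ? "unchanged" : "changed"), "\n";
```

With local in place the main code still sees the errno it had before the routine ran, which is the property a signal handler needs.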

Re^2: How to make parent wait for all the child processes.
by gjoshi (Sexton) on Nov 25, 2015 at 06:03 UTC

    Thanks, Apero. That was a very nice explanation, and I understand what you are saying. I need to add one more twist: what if I fork another process before forking these 3 processes?

    example:
        #!/usr/local/bin/perl -w
        use strict;
        use warnings;
        use IO::Socket::INET;
        use IO::Handle;
        use IO::Select;
        require POSIX; # provides WNOHANG

        # Set up a signal handler.
        # This prevents child processes from becoming zombies.
        # The subroutine is defined further below.
        $SIG{CHLD} = \&reap_kids;

        my $IsParent = 1;
        my $ListenPort = 5000;
        my $PPID = $$;

        sub StartServer {
            (my $Port) = @_;
            print "StartServer($Port) \n";
            my %data;
            my %nextpass;
            my $bufs ="";
            my $tbufs;
            my $TempSessionID = "12345";
            $0="UE_ServerListen_$TempSessionID\_Port_$Port"; # Rename process
            my $socket = new IO::Socket::INET (
                LocalHost => '127.0.0.1',
                LocalPort => $Port,
                Proto => 'tcp',
                Listen => $Port,
                Blocking => 0,
                Reuse => 1
            ) or die "ERROR in Socket Creation : $!\n";
            # Bound to 127.0.0.1:$Port
            my $select = IO::Select->new($socket) or die "IO::Select $!";
            # server waiting for client connection on port $Port
            print "Server started listen on port $Port \n";
            my $IsRunning = 1;
            while($IsRunning) {
                # Fix For UEC process stays alive in the BG
                my $exists = kill 0, $PPID;
                print "Parent is dead exiting...\n" if (!$exists);
                $data{'Sys_Status'} = 0 if (!$exists);
                $IsRunning = 0 if (!$exists);
                my @ready_clients = $select->can_read(0);
                foreach my $fh (@ready_clients) {
                    if($fh == $socket) {
                        # New client $fh (total clients: (($select->count)-1)
                        my $new = $socket->accept();
                        $select->add($new);
                    }
                }
                @ready_clients = $select->can_read(0);
                foreach my $fh (@ready_clients) {
                    if($fh != $socket) {
                        $fh->recv(my $tmpbufs,1024);
                        if ($tmpbufs) {
                            my $bufs="";
                            my $tbufs="";
                            $bufs=$nextpass{$fh} if $nextpass{$fh};
                            if (substr($tmpbufs, -1) ne "\n") {
                                for (my $x=length($tmpbufs);$x>=1;$x--) {
                                    if (substr($tmpbufs, $x,1) eq "\n") {
                                        $tbufs=substr($tmpbufs,0, $x);
                                        $nextpass{$fh}=substr($tmpbufs,$x+1);
                                        last ;
                                    }
                                }
                                $bufs.=$tbufs;
                            } else {
                                my $tmp = $nextpass{$fh};
                                if ((defined($tmp)) && ($tmp ne '')) {
                                    $bufs= $nextpass{$fh}.$tmpbufs;
                                    $nextpass{$fh}="";
                                } else {
                                    $bufs= $tmpbufs;
                                    $nextpass{$fh}="";
                                }
                            }
                            chomp($bufs);
                            foreach my $buf ((split(/\n/,$bufs))) {
                                chomp($buf);
                                (my $command,my $key,my $value)=split(/\|/,$buf);
                                if ($command =~ /Die/) { $IsRunning = 0; exit; }
                                elsif ($command =~ /w/) { $data{$key}=$value;}
                                elsif ($command =~ /r/) { print $fh "$data{$key}\n"; }
                                elsif ($command =~ /m/) { delete $data{$key}; }
                                elsif ($command =~ /e/) { my $ResVal = exists($data{$key}); print $fh "$ResVal\n"; }
                                elsif ($command =~ /s/) { printTS($value . "\n",-1,0.000,$key,1); }
                                elsif ($command =~ /k/) {
                                    my $count=0;
                                    foreach my $key (keys %data) {
                                        print $fh "$key\n";
                                    }
                                    print $fh "EOT\n";
                                }
                                else { print "BAD: $buf \n $bufs|"; }
                            }
                        } else {
                            # Client disconnected
                            $select->remove($fh);
                            # Connection Closed: $fh
                            close($fh);
                            # Total connected clients => (($select->count)-1)
                        }
                    }
                }
                select(undef,undef,undef, .1);
            }
            $socket->close();
        }

        sub ConnectToServer {
            (my $host ,my $Port ,my $cmd_timeout)= @_;
            print "ConnectToServer($host ,$Port ,$cmd_timeout) \n";
            alarm ($cmd_timeout);
            my $t_recv = new IO::Socket::INET (
                PeerHost => $host,
                PeerPort => $Port,
                TimeOut => $cmd_timeout,
                Blocking => 1,
                Proto => 'tcp',) or print "Unable make connection\n";
            alarm 0;
            return $t_recv;
        }

        sub KillServer {
            (my $SocketHndl) = @_;
            print "KillServer($SocketHndl)\n";
            if ((defined($SocketHndl)) && ($SocketHndl != 0)) {
                print $SocketHndl "Die\n";
            }
        }

        sub WriteValueToDB {
            (my $Value, my $DbHndl, my $HashKey) = @_;
            print "WriteValueToDB($Value, $DbHndl, $HashKey) \n";
            if ((defined($DbHndl)) && ($DbHndl != 0)) {
                print "Value is $HashKey: $Value, To DB $DbHndl \n";
                print $DbHndl "w|$HashKey|$Value\n";
            }
        }

        sub ReadValue {
            (my $Key ,my $SocketHndl) = @_;
            print "ReadValue($Key ,$SocketHndl) \n";
            if ((!defined($SocketHndl)) || ($SocketHndl == 0)) { print "SOCKET HANDLE IS EMTPRT \n"; return undef; }
            if (!DoesExists($Key ,$SocketHndl)) { print "CAN NOT FIND THE REC \n"; return undef; }
            $SocketHndl->send("r|$Key\n");
            my $data=<$SocketHndl>;
            chomp($data);
            return $data;
        }

        sub DoesExists {
            (my $Key, my $SocketHndl) = @_;
            print "DoesExists($Key ,$SocketHndl) \n";
            my $data;
            if ((defined($SocketHndl)) && ($SocketHndl != 0)) {
                $SocketHndl->send("e|$Key\n");
                $data=<$SocketHndl>;
                chomp($data);
            }
            return $data;
        }

        #start the IPC server
        #----------------------------- MAIN ----------------------------
        my $IsPortBusy = 0;
        my $socket;
        do {
            $IsPortBusy = 0;
            $ListenPort++;
            $socket = new IO::Socket::INET(
                LocalHost => '127.0.0.1',
                LocalPort => $ListenPort,
                Proto => 'tcp',
                Listen => $ListenPort,
                Blocking => 0,
                Reuse => 1) or $IsPortBusy = 1;
        } while ($IsPortBusy);
        $socket->close();
        undef($socket);

        my $ServerListenPid = fork();
        if (!$ServerListenPid) {
            $IsParent = 0;
            StartServer($ListenPort);
        }
        sleep 1;

        # Track spawned child processes.
        # This is a hash, keyed by the PID.
        # We'll use this later to mark them as finished.
        my %kids;

        for my $child_id (1..3) {
            my $pid = fork();

            # PARENT: track kid
            if ($pid) {
                $kids{$pid} = 1;
            }

            # CHILD: dispatch to child-routine below.
            # Explicitly exit, in case the child code neglects.
            else {
                dispatch_child( $$ );
                exit(0);
            }
        }

        print "Main code now waiting for all children.\n";

        # Wait for all children to finish.
        # The signal handler, reap_kids(), catches finished children.
        # Just continue to sleep if there are pending processes.
        while( scalar(keys %kids) > 0 ) {
            sleep 1;
        }

        # That's all the main code does.
        print "All done with main code.\n";
        exit(0);

        # ---
        # Subroutines
        # ---

        # Child reaper.
        # This will be called when the kernel tells us a process has
        # finished running. It's possible more than 1 has done so.
        # We run this in a loop to reap all children.
        sub reap_kids {
            local $!; # good practice. avoids changing errno.
            while (1) {
                my $kid = waitpid( -1, POSIX->WNOHANG );
                last unless ($kid > 0); # No more to reap.
                delete $kids{$kid}; # untrack kid.
            }
        }

        # Child dispatch code.
        # Here is where you write what your child does.
        # Ideally it should exit, but we enforce this above anyway.
        sub dispatch_child {
            my ($id) = @_; # passed from caller.

            print "Hello from child number $id\n";

            # Sleep for a random number of seconds.
            # Between 5 and 10.
            my $seconds = 5 + int(rand(6));
            sleep $seconds;

            print "Goodbye from child number $id\n";

            # Be nice and explicitly exit:
            exit(0);
        }

    What happens now is shown in the output below, and I'm not sure how to stop it. Once all 3 kids are done, I don't want them to execute again. Any suggestions on this too?


    StartServer(5001)
    Server started listen on port 5001
    Hello from child number 1814
    Hello from child number 1815
    Main code now waiting for all children.
    Hello from child number 1816
    Goodbye from child number 1814
    Goodbye from child number 1816
    Goodbye from child number 1815
    All done with main code.
    $:~/Temp$ Parent is dead exiting...
    Hello from child number 2133
    Hello from child number 2134
    Main code now waiting for all children.
    Hello from child number 2135
    Goodbye from child number 2134
    Goodbye from child number 2133
    Goodbye from child number 2135
    All done with main code.

      Your StartServer() subroutine suffers from the same problem your original code did in that it never exits when it finishes in your child process. This means you get a second process that continues executing after you invoke that routine.

      The parent process runs the for loop with my code from above, waits for the 3 children it starts to finish, and stops. However, you still have the child you fork()-ed before you called that StartServer() routine. The child just executes code from that point, including the later forks that you seem to only want in your parent.

      I suspect you want the child codepath in your modified code above mine to call exit() after its work is done. This is why the example I gave you does this inside the conditional testing the PID (and actually calls it from the child processing routine, so the main-level code is really just a backup in case someone mistakenly calls return() from it.)
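A minimal sketch of that pattern follows, assuming a hypothetical run_server() stand-in for StartServer() and a hypothetical spawn() helper. The child side always reaches exit(), so it cannot fall through into the parent's later forks even if the work routine returns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for StartServer(); it simply returns, as a real
# server routine eventually might when its loop ends.
sub run_server { my ($port) = @_; return }

# Hypothetical helper: fork a child that runs $work and then always exits.
sub spawn {
    my ($work, @args) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    return $pid if $pid;    # PARENT: carry on with the main code
    $work->(@args);         # CHILD: do the work...
    exit(0);                # ...and never run the parent's later code
}

my $server_pid = spawn(\&run_server, 5001);

# Only here so the sketch finishes cleanly; a long-running server child
# would normally be left running while the parent continues.
waitpid($server_pid, 0);
my $server_exit = $? >> 8;
print "server child $server_pid exited with $server_exit\n";
```

In the posted program the equivalent change would be an exit(0) right after the StartServer($ListenPort) call inside the if (!$ServerListenPid) block.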

      You might also consider putting your main code above your subroutines. It's a bit hard to read when your code looks like this ...

      ... since one has to read the entire program to figure out if there's more code or just more subroutines. It would be better in my short example to put thing3() above all the subroutines.