ristov has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to write code that waits for a child process to exit, but if the code receives the TERM signal while waiting, the child process should be terminated with the same signal. While waiting for the child can easily be done with waitpid($pid, 0), signals do not interrupt waitpid() with the EINTR error code. To solve this problem, one could use the following code with non-blocking waitpid():

$term = 0;
$SIG{TERM} = sub { $term = 1; };
for (;;) {
    $p = waitpid($pid, WNOHANG);
    # where did the child disappear?
    if ($p == -1) { exit(1); }
    # child has terminated
    if ($p == $pid && (WIFEXITED($?) || WIFSIGNALED($?))) { exit($?>>8); }
    # TERM has arrived, forward it to child and exit
    if ($term) { kill('TERM', $pid); exit(0); }
    # check the child again after 1 second
    sleep(1);
}

However, I am interested in whether the same task can be accomplished with a blocking waitpid(), which consumes less CPU time. There is one very interesting recipe which involves the use of eval:

http://blog.kazuhooku.com/2015/02/writing-signal-aware-waitpid-in-perl.html
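
The general idea behind eval-based recipes of this kind can be sketched as follows. This is a minimal illustration and not the code from the linked post: the helper name wait_or_term is hypothetical, and the sketch does not handle the window in which TERM arrives just after waitpid() has returned but before the eval block is left.

```perl
use strict;
use warnings;

# Hypothetical helper: block in waitpid() until the child exits,
# but give up early if TERM arrives. Returns the waitpid() result,
# or undef if the wait was abandoned because of TERM.
sub wait_or_term {
    my ($pid) = @_;
    my $reaped = eval {
        # die() from the handler unwinds out of the eval block,
        # abandoning the blocking waitpid().
        local $SIG{TERM} = sub { die "TERM\n" };
        waitpid($pid, 0);
    };
    return $reaped if defined $reaped;
    die $@ unless $@ eq "TERM\n";    # rethrow unrelated errors
    return;                          # undef: interrupted by TERM
}
```

A caller would typically forward the signal when the helper returns undef, for example with kill('TERM', $pid) followed by another waitpid($pid, 0).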

Nevertheless, I am wondering whether it would be OK to use the blocking waitpid() with a different signal handler for the same purpose:

# waitpid($pid,WNOHANG) returns 0 if the child process
# exists and has not terminated
$SIG{TERM} = sub { waitpid($pid,WNOHANG) || kill('TERM', $pid); exit(0); };
# waitpid loop
for (;;) {
    $p = waitpid($pid, 0);
    # where did the child disappear?
    if ($p == -1) { exit(1); }
    # child has terminated
    if ($p == $pid && (WIFEXITED($?) || WIFSIGNALED($?))) { exit($?>>8); }
}

In the signal handler, waitpid($pid, WNOHANG) is used for checking if the child process exists, in order to avoid sending TERM to a non-existing process. Since I am not too familiar with Perl internals, I am not sure if it is OK if the signal handler is invoked in the middle of blocking waitpid(), in order to call waitpid() again in non-blocking mode from the handler. Can anyone provide some insights? If this approach has flaws, I would go with previous code examples.

regards, risto

Replies are listed 'Best First'.
Re: using waitpid() with signals
by kcott (Archbishop) on Jan 21, 2017 at 03:34 UTC

    G'day ristov,

    Welcome to the Monastery.

    "I am trying to write the code which would wait for the child process to exit -- but if the code gets the TERM signal while waiting, the child process should be terminated with the same signal. ... I am interested whether the same task can be accomplished with blocking waitpid() ..."

    This code demonstrates how you could do this:

    #!/usr/bin/env perl -l
    use strict;
    use warnings;
    use constant SLEEP_TIME => 20;
    my $pid = fork;
    die "Can't fork()" unless defined $pid;
    if ($pid) {
        print "Parent: $$; Child: $pid";
        local $SIG{TERM} = sub {
            print "Parent received signal: @_";
            kill TERM => $pid if kill 0 => $pid;
        };
        print "Waiting on child ...";
        waitpid($pid, 0);
        print "Child terminated.";
    }
    else {
        print "Child: $$";
        local $SIG{TERM} = sub {
            print "Child received signal: @_";
            die "Child died via signal handler.\n";
        };
        sleep SLEEP_TIME;
        exit;
    }

    Running without any intervention:

    Parent: 28186; Child: 28187
    Waiting on child ...
    Child: 28187
    ... SLEEP_TIME seconds pass
    Child terminated.

    Running and sending TERM to the parent:

    Parent: 28192; Child: 28193
    Waiting on child ...
    Child: 28193
    ... on another command line: kill -TERM 28192
    Parent received signal: TERM
    Child received signal: TERM
    Child died via signal handler.
    Child terminated.

    Running and sending TERM to the child:

    Parent: 28201; Child: 28202
    Waiting on child ...
    Child: 28202
    ... on another command line: kill -TERM 28202
    Child received signal: TERM
    Child died via signal handler.
    Child terminated.

    The signal handler in the child was purely for demonstration purposes. Here's the output, from similar runs, with it removed.

    Parent: 28405; Child: 28406
    Waiting on child ...
    Child: 28406
    ... SLEEP_TIME seconds pass
    Child terminated.

    Parent: 28419; Child: 28420
    Waiting on child ...
    Child: 28420
    ... kill -TERM 28419
    Parent received signal: TERM
    Child terminated.

    Parent: 28430; Child: 28431
    Waiting on child ...
    Child: 28431
    ... kill -TERM 28431
    Child terminated.

    "... checking if the child process exists, in order to avoid sending TERM to a non-existing process."

    That's handled by "... if kill 0 => $pid;". See kill.

    See also: perlipc: Signals.

    — Ken

      hi Ken,

      the solution you have suggested looks like my last example, except that the presence of the child process is verified with kill(0, $pid), not waitpid($pid, WNOHANG). The use of kill() involves one caveat, though -- it merely checks that a process with the given PID exists, but that does not mean this process is a child. To illustrate this, suppose you are logged in as root and execute the following command line:

      perl -e 'kill(0, 1) && print "Init is our child\n"'

      On my Linux laptop, this command line always prints out the message "Init is our child", although this is not true. This caveat can lead to the following race condition -- if the signal handler gets triggered after waitpid($pid, 0) has returned *and* the OS has had enough time to start a new process with the same PID, the new process can mistakenly get the TERM signal. I acknowledge that modern operating systems attempt not to reuse the same PID immediately, but I wouldn't like to rely on that assumption in the code, especially because it will be executing with root privileges and can thus kill any process in the system.

      In contrast, waitpid($pid, WNOHANG) returns 0 for running child processes only, and a new unrelated process with the same PID is not reported as a running child. That's why I was asking if it is OK to call waitpid() again from the signal handler, if the blocking waitpid() call was interrupted by the signal.
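
The difference between the two checks can be illustrated with a small sketch. The helper name child_state is hypothetical; note that calling it reaps the child if the child has already exited, so the exit status must be read from $? right away in that case.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Hypothetical helper: classify $pid from the point of view of this process,
# using only the return value of a non-blocking waitpid().
sub child_state {
    my ($pid) = @_;
    my $r = waitpid($pid, WNOHANG);
    return 'running' if $r == 0;       # our child, still alive
    return 'reaped'  if $r == $pid;    # our child, just collected; status in $?
    return 'not-a-child';              # -1: no such child of this process
}
```

Unlike kill(0, $pid), this never reports an unrelated process that happens to have the same PID as 'running'.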

      regards, risto

        "This caveat can lead to the following race condition -- if the signal handler gets triggered after waitpid($pid, 0) has returned *and* the OS has had enough time to start a new process with the same PID, the new process can mistakenly get the TERM signal. I acknowledge that modern operating systems attempt not to reuse the same PID immediately, but I wouldn't like to rely on that assumption in the code, especially because it will be executing with root privileges and can thus kill any process in the system."

        You can put the local handler passing the TERM to the child, as well as the waitpid, in an anonymous block. When the waitpid returns, the anonymous block is exited, the handler is popped off the stack, and whatever previous handler was in effect is now in effect again.

        #!/usr/bin/env perl -l
        use strict;
        use warnings;
        use constant SLEEP_TIME => 20;
        my $pid = fork;
        die "Can't fork()" unless defined $pid;
        if ($pid) {
            local $SIG{TERM} = sub {
                print "Parent received signal: @_";
                die "$$ committing suicide!\n";
            };
            {
                local $SIG{TERM} = sub {
                    print "Parent received signal: @_";
                    print "$$ committing infanticide!";
                    kill TERM => $pid if kill 0 => $pid;
                };
                print "Parent: $$; Child: $pid";
                print "Waiting on child ...";
                waitpid($pid, 0);
            }
            print "Child terminated.";
            print "$$ resting ...";
            sleep SLEEP_TIME;
            print "$$ rested and exiting.";
        }
        else {
            print "Child: $$";
            local $SIG{TERM} = sub {
                print "Child received signal: @_";
                die "Child died via signal handler.\n";
            };
            sleep SLEEP_TIME;
            exit;
        }

        So I ran it first without any intervention; it slept as expected:

        Parent: 36789; Child: 36790
        Waiting on child ...
        Child: 36790
        Child terminated.
        36789 resting ...
        36789 rested and exiting.

        Next I started it running, then entered "kill -TERM 36797" twice in quick succession:

        Parent: 36797; Child: 36798
        Waiting on child ...
        Child: 36798
        Parent received signal: TERM
        36797 committing infanticide!
        Child received signal: TERM
        Child died via signal handler.
        Child terminated.
        36797 resting ...
        Parent received signal: TERM
        36797 committing suicide!

        Finally, I entered "kill -TERM 36836" followed by "kill -TERM 36835":

        Parent: 36835; Child: 36836
        Waiting on child ...
        Child: 36836
        Child received signal: TERM
        Child died via signal handler.
        Child terminated.
        36835 resting ...
        Parent received signal: TERM
        36835 committing suicide!

        — Ken

Re: using waitpid() with signals
by Anonymous Monk on Jan 21, 2017 at 23:03 UTC

    A couple of thoughts. First, if the parent does nothing else besides waiting on the child, it could have simply exec-uted that without forking. But I assume you've reasons to do it the more complicated way.

    Secondly, if the parent process was invoked from a shell as usual, it should be running as a process group leader. Cleaning up can then be made simple and robust: just kill the entire process group.
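
A minimal sketch of signalling a whole process group, assuming the group ID is known. The helper name term_group is hypothetical; the negative process ID is what tells kill() to target the group rather than a single process.

```perl
use strict;
use warnings;

# Hypothetical helper: send TERM to every process in process group $pgid.
# Returns a true value if the signal was delivered successfully.
sub term_group {
    my ($pgid) = @_;
    return kill TERM => -$pgid;
}
```

If the caller is itself a member of the group, for example when passing getpgrp(), it receives the signal too, so a handler doing this would probably want to set local $SIG{TERM} = 'IGNORE' first.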

    The act of waiting on a subtask can also be accomplished differently. One might select or read on a pipe/socketpair to detect the termination of the other end.
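
The pipe variant might look roughly like this. It is only a sketch under stated assumptions: the helper names spawn_with_pipe and wait_for_eof are hypothetical, the child runs Perl code rather than exec-ing another program (Perl marks the pipe close-on-exec by default, so an exec would close it early), and no other process inherits the write end.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child running $code. The child holds the
# write end of a pipe for its whole lifetime. Returns ($pid, $read_handle).
sub spawn_with_pipe {
    my ($code) = @_;
    pipe(my $rd, my $wr) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        close $rd;
        $code->();
        POSIX::_exit(0);    # the kernel closes $wr when the child exits
    }
    close $wr;              # the parent must not keep a writer open
    return ($pid, $rd);
}

# Block until the write end is closed, i.e. the child is gone.
# Returns 1 on EOF and retries when a signal interrupts the read.
sub wait_for_eof {
    my ($rd) = @_;
    for (;;) {
        my $n = sysread($rd, my $buf, 1);
        return 1 if defined $n && $n == 0;    # EOF
        next if defined $n;                   # child wrote data; ignore it
        next if $!{EINTR};                    # interrupted by a signal
        die "read failed: $!";
    }
}
```

The child still has to be reaped with waitpid() afterwards; the pipe only reports that it has gone away.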

    AFAIC, there is nothing wrong with having the waitpid in a signal handler as in your second example.