jaiieq has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that runs an external command. If the command fails to respond within 20 seconds, I kill it off and proceed to the rest of the script. However, when I have to "kill" the process, it is leaving a zombie child process.

Since this command can be run hundreds to thousands of times in a given period, the zombie processes add up until the system returns "Resources not available, unable to fork," and the script dies.

Here is the code in question (which I found using sites like this). Is there a better way to do this, or is my code just screwy?

    eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm 20;
        $CHILD = fork();
        if( $CHILD == 0 ) {
            exec( "my external command" );
        }
        waitpid( $CHILD, 0 );
        alarm 0;
    };
    if( $@ ) {
        die unless $@ eq "alarm\n";
        kill 9, $CHILD
    }

What I need is a way to execute an external command via Perl, have it time out after a certain period if it fails to respond, and then kill the child process once the timeout is hit.

This is on OS X 10.6 with the latest version of Perl

EDIT: By just simply adding another waitpid after the kill command (as suggested), the zombies are gone. Not sure why I never tried that before! Thanks for the help!

Replies are listed 'Best First'.
Re: exec creating zombie processes
by ikegami (Patriarch) on Feb 14, 2011 at 19:08 UTC

    exec creating zombie processes

    exec doesn't create processes. exec cannot cause processes to become zombies.

    when I have to "kill" the process, it is leaving a zombie child process.

    In unix, whenever a program exits (from a signal or otherwise), it becomes a zombie until it is reaped by its parent. This serves two purposes: It reserves the PID until the parent knows that it cannot use it anymore, and it allows the parent to obtain the exit code of the child.

    In the code path where you kill the child, you don't reap the child. Add a call to waitpid.
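A minimal sketch of what that could look like, assuming a Unix-like system. The helper name run_with_timeout and its return values are made up for illustration; the point is only that both code paths end in a waitpid, so the child is reaped whether it finishes by itself or gets killed.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper (not from the original post): run @cmd and give up
# after $timeout seconds. Returns ($timed_out, $wait_status).
sub run_with_timeout {
    my ($timeout, @cmd) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: replace ourselves with the command; exit hard if exec fails.
        exec(@cmd) or POSIX::_exit(127);
    }

    my ($timed_out, $status) = (0, undef);
    eval {
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm $timeout;
        # Normal path: the child exits in time and is reaped here.
        $status = $? if waitpid($pid, 0) == $pid;
        alarm 0;
        1;
    } or do {
        die $@ unless $@ eq "alarm\n";   # propagate unexpected errors
        if (!defined $status) {
            # Timeout path: kill the child, then reap it so no zombie is left.
            kill 'KILL', $pid;
            waitpid($pid, 0);
            $status    = $?;
            $timed_out = 1;
        }
    };
    return ($timed_out, $status);
}
```

Called as my ($timed_out, $status) = run_with_timeout(20, 'my', 'external', 'command'), the low seven bits of $status hold the signal number if the child was killed, and $status >> 8 holds the exit code otherwise.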

Re: exec creating zombie processes
by ELISHEVA (Prior) on Feb 14, 2011 at 19:35 UTC

    Is this your actual code? You are subtracting fork() from $CHILD rather than setting $CHILD to the pid returned by fork(). That would mean that kill 9, $CHILD is in reality calling kill 9, undef and isn't killing anything at all. It would also mean that if ( $CHILD == 0) should be spitting out warnings, but only if you are using use strict; use warnings. Are you?

      That was a typo, sorry! (And yes I always use use strict and use warnings)
Re: exec creating zombie processes
by Perlbotics (Archbishop) on Feb 14, 2011 at 19:06 UTC

    What happens if you use kill -9, $CHILD (negative number)? This should kill the whole process group (process + children). See also Signals in perlipc. Not sure if that works with OS X though...
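One caveat, sketched below under the assumption of a POSIX-style system: a negative signal targets a process group, so the child has to lead its own group for kill -9, $CHILD to hit anything. The helper names spawn_in_own_group and kill_group_and_reap are invented for this example. The child still needs a waitpid afterwards, or it stays a zombie.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: fork a child that leads its own process group.
sub spawn_in_own_group {
    my @cmd = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        setpgrp(0, 0);                     # child: become group leader
        exec(@cmd) or POSIX::_exit(127);
    }

    # Parent sets the group too, so it does not matter who runs first.
    # This may fail harmlessly if the child has already exec'd.
    setpgrp($pid, $pid);
    return $pid;
}

# Hypothetical helper: signal the whole group, then reap the leader.
sub kill_group_and_reap {
    my ($pid) = @_;
    kill -9, $pid;        # negative signal: deliver SIGKILL to the group
    waitpid($pid, 0);     # still required to clear the zombie
    return $?;
}
```

Only the group leader is a direct child here; any grandchildren killed along with it get reaped by init once they are orphaned.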

Re: exec creating zombie processes
by JavaFan (Canon) on Feb 14, 2011 at 20:50 UTC
    The problem is, when there is a timeout, your parent process is waiting. Since you interrupt the wait, the zombie is never reaped.

    Instead of adding another wait, why not just set $SIG{CHLD} = 'IGNORE';?

      Cause it complicates things. First, that wouldn't work. You'd have to change the existing waitpid to sleep. You'd end up with:

      {
          local $SIG{CHLD} = 'IGNORE';
          ... launch child ...
          if (sleep(20)) {
              kill(KILL => $child_pid);
              # Wait for child to die before restoring $SIG{CHLD}.
              # Unlikely race condition.
              while (kill(0, $child_pid)) {
                  sleep(1) or last;
              }
          }
      }

      There are side-effects to using $SIG{CHLD} = 'IGNORE'.

      • One can't know the exit code of one's children,
      • one can't use system, backticks, open '-|' or open '|-', and
      • one needs to worry about EINTR.
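The first of those side-effects can be seen with a small sketch like the one below. It assumes a system that auto-reaps children while SIGCHLD is ignored (Linux and OS X do, as far as I know), and the sub name status_with_sigchld_ignored is made up for the demonstration.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo: fork a child that exits with code 3 while SIGCHLD
# is ignored, then try to collect its status the usual way.
sub status_with_sigchld_ignored {
    local $SIG{CHLD} = 'IGNORE';

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(3) if $pid == 0;      # child: exit immediately

    # The kernel discards the child's status, so once the child is gone
    # waitpid has nothing left to report.
    my $result = waitpid($pid, 0);
    return ($result, $?);
}
```

On such systems waitpid returns -1 and $? is set to -1, so the exit code 3 is lost. That is also why system and backticks stop reporting a useful status under this setting.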
Re: exec creating zombie processes
by locked_user sundialsvc4 (Abbot) on Feb 14, 2011 at 19:08 UTC

    I must be missing something.   Why can’t you put another waitpid() call in your code, after the kill?

    Yes, that does mean that, after firing the fatal bullet, your code must wait for the flailing corpse to actually hit the ground, but ...

    As you well know, Unix systems are designed to keep dead children in this “zombie” state so that you can gather their final status by doing a waitpid. You do this in one case but not the other. If no exception or signal is thrown, you reap the (dead) child correctly. But if one is, your one waitpid call is never executed and you do not execute another one in your exception-handling block. I think that is the crux of your problem.