Killing Forks

Earindil has asked for the wisdom of the Perl Monks concerning the following question:

Have a monitoring app written in perl that forks off a bunch of sub processes. Sometimes, these sub processes hang on bad nodes and won't die off. Been trying to use sig alrm to kill them but I just can't seem to get this right. Forking still confuses me a bit, so guessing I'm just not quite getting it here. Would appreciate any input.

Here is some sample code. Every fork should get killed off due to the sleep but isn't.
A bit of the code here was taken from examples I found, some of it probably isn't necessary.

#!/bin/perl
for ($i=0; $i<=10; $i++) {
        wait_for_a_kid() if keys %pid_to_node > 3;
        $pid = fork;
        if ($pid) {
                ## parent does...
                $pid_to_node{$pid} = $i;
        }
        else {
                print "$i $$\n";
                local $SIG {ALRM} = sub {
                kill -15, $$ or die "kill: $!";
                print "\tKilled PID $$\n"}; # Just SIGTERM
                eval {
                        ## child does...
                        setpgrp(0,0);
                        exit !&Test;
                        alarm 1;
                        waitpid $pid => 0;
                };
        }
}
## final reap:
1 while wait_for_a_kid();

sub wait_for_a_kid {
        my $pid = wait;
        return 0 if $pid < 0;
        my $node = delete $pid_to_node{$pid} or warn("Why did I see $p
+id ($?)\n"), next;
}

sub Test {
        sleep 10;
}
[download]

OUTPUT:

0 1668
1 1669
2 1670
3 1671
Why did I see 1668 (0)
5 1811
6 1812
7 1813
8 1814
9 1920
10 1921

Comment on Killing Forks Download Code

Replies are listed 'Best First'.
Re: Killing Forks by jbert (Priest) on Aug 14, 2007 at 12:58 UTC
You're passing a -ve signal number to kill. From kill: "Unlike in the shell, if SIGNAL is negative, it kills process groups instead of processes. (On System V, a negative PROCESS number will also kill process groups, but that’s not portable.) That means you usually want to use positive not negative signals. You may also use a signal name in quotes." You probably also want to set the alarm before doing the operation which may take a long time. I'm not sure, but you might also be able to set the alarm handler (and possibly also call alarm) before the fork. I haven't checked, but those two properties might both be preserved across fork.	[reply]
Re^2: Killing Forks by Earindil (Beadle) on Aug 14, 2007 at 13:17 UTC
Thanks much! Simply moved a few things around and changed the kill to a 15 from a -15 and that seemed to do it. New Version: #!/etrade/bin/perl for ($i=0; $i<=10; $i++) { wait_for_a_kid() if keys %pid_to_node > 3; $pid = fork; if ($pid) { ## parent does... $pid_to_node{$pid} = $i; } else { print "$i $$\n"; local $SIG {ALRM} = sub { print "\tKilled PID $$\n"; kill 15, $$ or die "kill: $!"}; # Just SIGTERM alarm 6; eval { ## child does... exit !&Test; waitpid $pid => 0; }; } } ## final reap: 1 while wait_for_a_kid(); sub wait_for_a_kid { my $pid = wait; return 0 if $pid < 0; delete $pid_to_node{$pid} or warn("Why did I see $pid ($?)\n") +, next; } sub Test { sleep 5; } [download]	[reply] [d/l]
Re^3: Killing Forks by jbert (Priest) on Aug 14, 2007 at 13:34 UTC
Glad to know it's fixed. As I think about it a little, though, sending yourself a SIGTERM is a bit redundant, you could of course just 'die' or 'exit' in the SIGALRM handler. Also...you're calling 'waitpid' in the child (which is odd) after a call to exit (which is odder). The other way of doing this, of course, is to get the parent to periodically wake up (you can do this with waitpid and a sleep) and calculate if it should kill off a child (just store each pid's start time and kill off the old ones, maybe keeping them in a sorted list of age or something). One case where you might need to have the parent do it is if your long-running code calls out to some XS code which doesn't return up to perl when it receives a signal. Since perl 5.8, perl uses 'safe signals' by default, which means that your signal handlers only fire when control returns to the perl interpreter. This can be a long time (or never) when you call out of perl into C/XS code. (Safe signals are 'better', but you sometimes need to be aware of this issue when doing signal handling in perl).	[reply]
Re^4: Killing Forks by Earindil (Beadle) on Aug 14, 2007 at 13:41 UTC
Re^4: Killing Forks by Earindil (Beadle) on Aug 14, 2007 at 14:14 UTC
Re^5: Killing Forks by jbert (Priest) on Aug 14, 2007 at 16:15 UTC
Re^5: Killing Forks by cdarke (Prior) on Aug 14, 2007 at 16:18 UTC