Marcello has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a parent process (which is a package) with the following loop:

    while (1) {
        $self->check_start();
        $self->check_run();
        # Sleep 1/10 second
        select(undef, undef, undef, 0.1);
    }

This runs perfectly forever, without ever stopping (check_start and check_run fork off child processes if needed to do some work). To clean up zombie processes, I added the following code:

    $SIG{CHLD} = \&reaper;

    while (1) {
        $self->check_start();
        $self->check_run();
        # Sleep 1/10 second
        select(undef, undef, undef, 0.1);
    }

    sub reaper {
        my $pid;
        do {
            $pid = waitpid(-1, &WNOHANG);
        } while ($pid > 0);
        $SIG{CHLD} = \&reaper;
    }

Since then, my parent process has been killed every now and then (check_start and check_run were left unchanged).

Question 1: Does this have anything to do with the added code?
Question 2: How can I determine what killed the parent process, so I can fix this? (I already catch SIGTERM and SIGINT, and neither is the signal killing the parent.)

TIA!

Title change per author's request - dvergin 2002-04-23

Replies are listed 'Best First'.
Re: Determining proces
by Elian (Parson) on Apr 23, 2002 at 19:15 UTC
    Signal handlers in perl aren't 100% safe, at least not until recently (I think you need a perl in the 5.7.x range for safe signals). Because of that, this will occasionally die: perl does things inside signal handlers that aren't legal there (memory allocation, for one).
      Hi Elian,

      Thank you for your clear answer. Is my solution then still the preferred way to clean up zombie processes, or is there a better one which does NOT use signals (so that my parent process never stops)?

      Regards,
      Marcello
        Try this:
        $SIG{CHLD} = 'IGNORE';
        Depending on your platform that should prevent zombie processes.

        -sam
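A minimal sketch of the 'IGNORE' suggestion, assuming a platform where ignoring SIGCHLD makes the kernel reap children automatically (Linux does; others may not). The three forked children are a hypothetical stand-in for whatever check_start and check_run launch.

```perl
use strict;
use warnings;

# Ask the kernel to discard exit statuses so children never become zombies.
# This is platform-dependent, as noted above.
$SIG{CHLD} = 'IGNORE';

# Hypothetical stand-in for the work check_start()/check_run() fork off.
for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit 0 if $pid == 0;          # child: finish immediately
}

# Where auto-reaping is in effect, a blocking waitpid waits for all children
# to finish and then returns -1, because nothing is left to collect.
my $result  = waitpid(-1, 0);
my $message = $result == -1
    ? "children were reaped automatically\n"
    : "collected child $result by hand; IGNORE did not auto-reap here\n";
print $message;
```

The trade-off is that the parent can no longer read a child's exit status, so this only fits if check_run does not need to know how a child ended.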

        It's about the best you can manage. If you don't declare lexicals in the signal handler you may find yourself more (though not completely) stable. Whatever you do, don't use those variables anywhere outside the signal handler!

        If you've got Inline, you could always try writing the signal handler in C, since what you're doing doesn't really require any perl interaction.

Re: Determining proces
by perlplexer (Hermit) on Apr 23, 2002 at 19:29 UTC
    From perlipc:

    ...doing nearly anything in your handler could in theory trigger a memory fault and subsequent core dump.

    In your case I would simply do this:
    while (1) {
        $self->check_start();
        $self->check_run();
        # Sleep 1/10 second
        select(undef, undef, undef, 0.1);
        1 while (waitpid(-1, &WNOHANG) > 0);
    }
    --perlplexer
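For anyone who wants to try the polling approach on its own, here is a self-contained sketch. It assumes WNOHANG comes from POSIX (the snippets in this thread never show where it is imported from), and the three forked children and the reap_children helper are hypothetical stand-ins for the real check_start/check_run work.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # exports WNOHANG

# Collect every child that has already exited, without blocking.
# Returns how many children were reaped on this pass.
sub reap_children {
    my $count = 0;
    $count++ while waitpid(-1, WNOHANG) > 0;
    return $count;
}

# Hypothetical stand-in for check_start()/check_run(): fork three children.
my $started = 0;
for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit 0 if $pid == 0;   # child: pretend the work is already done
    $started++;
}

# Same shape as the main loop: sleep 1/10 second, then reap.
# The deadline only keeps this demo from looping forever.
my $reaped   = 0;
my $deadline = time + 10;
while ($reaped < $started && time < $deadline) {
    select(undef, undef, undef, 0.1);
    $reaped += reap_children();
}
print "started $started, reaped $reaped\n";
```

Because no signal handler is installed, nothing runs asynchronously inside perl; a zombie lingers for at most one pass of the loop.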
Re: Determining proces
by Fletch (Bishop) on Apr 23, 2002 at 19:02 UTC

    Consider running the parent under strace or truss (or whatever your platform has) and you should be able to see what's happening at the system call level when it dies.
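If strace or truss is not available, another way to approach Question 2 is to start the parent from a small wrapper and decode its wait status once it dies. The sketch below is only an illustration: describe_status is a hypothetical helper, and the child that kills itself with SIGKILL stands in for the real parent process.

```perl
use strict;
use warnings;

# Decode a wait status (the value perl puts in $?) into readable text.
# Low 7 bits: terminating signal; bit 8: core dump flag; high byte: exit code.
sub describe_status {
    my ($status) = @_;
    my $signal = $status & 127;
    my $core   = $status & 128;
    my $exit   = $status >> 8;
    return $signal
        ? sprintf("killed by signal %d%s", $signal, $core ? " (core dumped)" : "")
        : sprintf("exited with status %d", $exit);
}

# Hypothetical stand-in for the real parent: a child that kills itself.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    kill 'KILL', $$;   # simulate an unexpected death
    sleep 5;           # not reached in practice
    exit 0;
}

waitpid($pid, 0);
my $report = describe_status($?);
print "process $pid $report\n";
```

In real use the fork would be replaced by something like system($^X, 'parent.pl'), where parent.pl is an assumed script name; $? is filled in the same way after system returns.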