OK, I had time to look into it further.

First, wow! I didn't even know that "POSIX sigaction()... bypasses Perl safe signals" (perlipc). Hmm, makes sense, but shouldn't that sentence be in the POSIX documentation as well? Anyway, I've removed all printing from the program and run it under strace. One failure mode is as I described:

    --- SIGCHLD (Child exited) @ 0 (0) ---
    rt_sigaction(SIGCHLD, NULL, {0x49ff50, [], SA_RESTORER|SA_NODEFER, 0x7fdd9a6810a0}, 8) = 0
    --- SIGCHLD (Child exited) @ 0 (0) ---
    rt_sigaction(SIGCHLD, NULL, {0x49ff50, [], SA_RESTORER|SA_NODEFER, 0x7fdd9a6810a0}, 8) = 0
    rt_sigreturn(0x7fdd99b95e40) = 0
    rt_sigreturn(0x7fdd99b95e40) = 0
    time([1437206528]) = 1437206528
    pause(^C <unfinished ...>

It hangs here, in pause(), because no further signals arrive.

Interestingly enough, sometimes something else happens:

    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    --- SIGCHLD (Child exited) @ 0 (0) ---
    --- SIGCHLD (Child exited) @ 0 (0) ---
    rt_sigaction(SIGCHLD, NULL, {0x49ff50, [], SA_RESTORER|SA_NODEFER, 0x7efbfde8a0a0}, 8) = 0
    rt_sigreturn(0x7efbfd39ee40) = 136
    --- SIGSEGV (Segmentation fault) @ 0 (0) ---
    +++ killed by SIGSEGV +++

That happens when two signals are delivered in rapid succession. Typically that is right after the call to sigprocmask(SIG_SETMASK, empty_set, NULL) (signals are blocked before the call to clone, that is, fork, and unblocked afterwards). It seems one signal is pending and is delivered, and another one is delivered immediately afterwards (but it can also happen without sigprocmask, when two children just happen to terminate one right after the other). That causes Perl 5.22 to die with SIGSEGV. Removing SA_NODEFER just makes it hang in the call to sleep after a while, every time.
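
For comparison, here is a minimal sketch of the same block/fork/unblock pattern that stays on Perl's deferred ("safe") signals by using a plain %SIG handler instead of POSIX::sigaction. It is not the program from this thread: the names reap_children, %exit_status and $running are made up for illustration, and the children just exit at once. The handler loops on waitpid with WNOHANG because one SIGCHLD may stand for several exited children.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h :sys_wait_h);

my %exit_status;    # pid => exit code, filled in by the handler (hypothetical name)
my $running = 0;    # children forked but not yet reaped (hypothetical name)

# Reap every child that has exited so far. Signals can coalesce, so one
# SIGCHLD does not mean exactly one child: loop until nothing is left.
sub reap_children {
    local ($!, $?);    # do not clobber the interrupted code's $! and $?
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $exit_status{$pid} = $? >> 8;
        $running--;
    }
}

# A plain %SIG handler is dispatched by Perl between ops (safe signals).
$SIG{CHLD} = \&reap_children;

my $chld_set = POSIX::SigSet->new(SIGCHLD);
my $old_set  = POSIX::SigSet->new;

for my $n (1 .. 5) {
    # Block SIGCHLD so the bookkeeping below is done before any reaping.
    POSIX::sigprocmask(SIG_BLOCK, $chld_set, $old_set)
        or die "sigprocmask(SIG_BLOCK) failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    exit($n) if $pid == 0;    # child: exit immediately with code $n

    $running++;

    # Restore the previous mask; a pending SIGCHLD is delivered now.
    POSIX::sigprocmask(SIG_SETMASK, $old_set)
        or die "sigprocmask(SIG_SETMASK) failed: $!";
}

# Wait with a timeout rather than an unbounded pause, so a signal that
# arrives just before the wait cannot leave the loop stuck.
while ($running > 0) {
    select(undef, undef, undef, 0.1);
}

printf "%d children reaped\n", scalar keys %exit_status;
```

The timed select is a deliberate choice in this sketch: a bare pause or sleep after the last SIGCHLD has already been handled would wait forever, which is the hang shown in the first trace.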

So yeah, the combination of Perl's unsafe signals and one half of UNIX's unreliable signals doesn't work too well :-) (the other half of unreliable signals is SA_RESETHAND)

Just use Parallel::ForkManager :-)
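
A minimal sketch of what that could look like, assuming Parallel::ForkManager is installed from CPAN. The job list, the limit of 3 concurrent children and the %exit_code hash are invented for illustration; the child body is a placeholder where the real work (or an exec) would go.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

# At most 3 children at a time (arbitrary limit for this sketch).
my $pm = Parallel::ForkManager->new(3);

my %exit_code;    # job identifier => exit code (hypothetical name)

# Called in the parent each time a child has been reaped.
$pm->run_on_finish(sub {
    my ($pid, $code, $ident) = @_;
    $exit_code{$ident} = $code;
});

for my $job (1 .. 5) {
    # start() forks: it returns the child's pid in the parent and 0 in
    # the child, so the parent moves straight on to the next job.
    $pm->start($job) and next;

    # Child: placeholder for the real work.
    $pm->finish($job);    # exit the child with this code
}

# Block until every child has been reaped.
$pm->wait_all_children;

printf "%d jobs finished\n", scalar keys %exit_code;
```

The module does the forking and reaping itself, so none of the signal handling discussed above has to be written by hand.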