double fork trick vs sig chld wait

Voronich has asked for the wisdom of the Perl Monks concerning the following question:

I've got a service dispatch script that runs and spawns off children at an alarming rate. So the zombie hordes are terrifying. I'm trying to decide between catching SIGCHLD and calling wait, and the 'double fork' trick.

For background: the current code uses fork and system (rather than exec), because:

the launch script in the child process needs to play with the return code.
the top-level dispatch loop doesn't care one whit about the child processes; fire and forget.

That's all fine. But in my exuberance to fire and forget, I've ended up doing rather a lot of forgetting.

Is there a reason to double-fork rather than setting a SIGCHLD handler that does a simple wait?

Is the latter more stable? Less blockilicious (or more?)
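
For reference, the 'double fork' trick can be sketched roughly as follows. This is a minimal illustration under assumptions, not the poster's code: `spawn_detached` is a hypothetical helper name, and the grandchild body is a placeholder for the real launcher logic. The intermediate child exits right away, so the dispatcher's `waitpid` returns almost immediately, and the orphaned grandchild is reaped by init (or a subreaper) when it finishes.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Hypothetical helper: run @cmd in a grandchild that the dispatcher
# never has to wait for, so no zombies pile up in the dispatcher.
sub spawn_detached {
    my (@cmd) = @_;

    my $pid = fork();
    die "first fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Intermediate child: fork once more, then exit immediately.
        my $grandchild = fork();
        _exit(1) unless defined $grandchild;

        if ($grandchild == 0) {
            # Grandchild: do the real work; system() leaves the
            # command's status available for inspection.
            my $status = system(@cmd);
            # ... examine $status >> 8, parse output, log, etc. ...
            _exit($status == 0 ? 0 : 1);
        }
        _exit(0);    # orphan the grandchild
    }

    # Dispatcher: the intermediate child exits almost at once, so this
    # wait is short and leaves nothing to reap later.
    waitpid($pid, 0);
    return $? >> 8;
}
```

A call would look like `spawn_detached($stub_binary, @args)`, where both names are placeholders. The cost is one extra fork per job plus a brief blocking wait; in exchange, the dispatcher needs no signal handler at all.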

- V

http://mpwilson.com/

Re: double fork trick vs sig chld wait by chrestomanci (Priest) on Nov 19, 2010 at 15:46 UTC
Sounds to me that you need a different signal handler in the top-level dispatch loop than in the launch script. Something like: `$SIG{CHLD} = 'IGNORE'; foreach my $job (@big_list_of_jobs) { my $pid = fork(); unless ($pid) { my $child_cleanup_func = sub { # Examine the child's output, update log files etc. }; $SIG{CHLD} = $child_cleanup_func; exec $child_cmd, @args; print "Child did not start\n"; exit 1; } else { # parent # Forget about the child (launcher process from the fork above) # move on to the next job. } }` I think it would also be a good idea to read perlipc.
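
Laid out as a runnable script, the idea above might look something like this. It is a sketch under assumptions: `dispatch_jobs` and the job format (an array ref of command plus arguments) are made up for illustration, and it uses fork plus system as in the original question rather than exec. Two points worth keeping in mind: a Perl code-ref handler does not survive exec, while an 'IGNORE' disposition is inherited across both fork and exec. The launcher therefore restores 'DEFAULT' before calling system, since otherwise system may be unable to collect the command's exit status on many platforms.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Dispatcher: ask the kernel to discard children's exit statuses.
# On most modern Unix platforms this prevents zombies (see perlipc).
$SIG{CHLD} = 'IGNORE';

# Hypothetical dispatcher: each job is an array ref [command, args...].
sub dispatch_jobs {
    my (@jobs) = @_;
    my $launched = 0;

    for my $job (@jobs) {
        my $pid = fork();
        if (!defined $pid) {
            warn "fork failed: $!";
            next;
        }
        if ($pid == 0) {
            # Launcher: restore default SIGCHLD handling so system()
            # can wait for the command and report its status.
            $SIG{CHLD} = 'DEFAULT';
            my $status = system(@$job);
            # ... examine $status >> 8, update log files, etc. ...
            _exit($status == 0 ? 0 : 1);
        }
        $launched++;    # dispatcher: forget the launcher, move on
    }
    return $launched;
}
```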
Re^2: double fork trick vs sig chld wait by Voronich (Hermit) on Nov 19, 2010 at 15:57 UTC
A little broader context: This script dispatches jobs into a grid platform. It executes the stub binaries, which execute their grid equivalents and wait until they exit, then return to my script (where I pick up return codes, parse and process schlock from stdout, etc.) I actually tried `$SIG{CHLD} = 'IGNORE';` first. The net result was that the stub binaries wouldn't submit the grid jobs. I threw up my hands at that. I just don't have grid-fu and the people who were supposed to had no idea why that would occur. (But it was exhaustively demonstrated.) So I went back to trying to decide between double-fork and setting `$SIG{CHLD}` to a simple sub that just "wait"s. (Yes, been back and forth through perlipc before posting here. ;) ) http://www.mpwilson.com/uccu/
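
The "simple sub that just waits" is usually written as a non-blocking loop, along the lines of the pattern in perlipc. A minimal sketch follows; the `$reaped` counter and the name `reap_children` are illustrative only. Unlike 'IGNORE', a handler like this is reset to the default disposition when a child execs another program, which might be relevant to the stub binaries misbehaving under 'IGNORE', though nothing in this thread confirms that.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h _exit);

our $reaped = 0;    # illustrative counter, not part of the original script

# Non-blocking reaper: a single SIGCHLD can stand for several exited
# children, so keep collecting until none are left to reap.
sub reap_children {
    local ($!, $?);    # don't clobber values the main loop may be using
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $reaped++;
    }
}

$SIG{CHLD} = \&reap_children;
```

Because `waitpid` is called with `WNOHANG`, the handler never blocks the dispatch loop. A forked launcher inherits the handler, so it may be safer for the launcher to set `$SIG{CHLD} = 'DEFAULT'` before it calls system.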
Re^3: double fork trick vs sig chld wait by tod222 (Pilgrim) on Nov 19, 2010 at 20:03 UTC
"This script dispatches jobs into a grid platform." Does the script wait for the job to complete, or just for the submission to complete? If the script isn't waiting for the job to complete, you shouldn't need to fork children. I've written a script to submit multiple jobs to Grid Engine by constructing a 'qsub' command and executing that via system, and the qsub command completed quickly enough that there wasn't a need to fork. If the grid system you're working with supports DRMAA you might want to look into Schedule::DRMAAc.
Re^4: double fork trick vs sig chld wait by Voronich (Hermit) on Nov 19, 2010 at 20:18 UTC
Re: double fork trick vs sig chld wait by Illuminatus (Curate) on Nov 19, 2010 at 16:19 UTC
I guess I'm still a little unclear on what you need to accomplish with regard to child process management in the main loop. Ignoring SIGCLD won't do much for you, as the child's behavior will still be the same (i.e., a zombie). Since you are using system rather than exec, maybe you would be better served by using threads to do the work instead of child processes. I would pre-create a pool of 'child-threads', and task them as needed. As of perl 5.10 (I think), the stack_size parameter is supported, so you can create lots of threads without needing reams of memory. fnord