
Greetings!
I am writing a parallel processing application. The process flow goes something like this:

f1()
    f2() in background.
    f3() in background.
f4()
wait for 'f2'
f5()
wait for 'f3'

where f1, f2, etc. are subroutines.
In this model, both the parent and the children do work. Some steps in the parent can proceed only after earlier spawned children have completed part of the work.
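
To make the flow concrete, here is a minimal sketch of it, assuming plain fork() and waitpid(). The helpers spawn_bg and wait_for are hypothetical names used only for illustration, and f2 through f5 are empty stand-ins for the real subroutines.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code ref in a forked child and return its PID.
sub spawn_bg {
    my ($work) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: do the work, then exit without returning into the parent's flow.
        # A die inside the work becomes a nonzero exit code.
        exit(eval { $work->(); 1 } ? 0 : 1);
    }
    return $pid;
}

# Hypothetical helper: block until the given child exits; die if it failed.
sub wait_for {
    my ($name, $pid) = @_;
    my $reaped = waitpid($pid, 0);
    die "$name (pid $pid) could not be waited for\n" if $reaped != $pid;
    die "$name (pid $pid) failed, status $?\n" if $? != 0;
    return 1;
}

# Empty stand-ins for the real subroutines.
sub f2 { }
sub f3 { }
sub f4 { }
sub f5 { }

sub f1 {
    my $pid2 = spawn_bg(\&f2);   # f2() in background
    my $pid3 = spawn_bg(\&f3);   # f3() in background
    return ($pid2, $pid3);
}

my ($pid2, $pid3) = f1();
f4();
wait_for('f2', $pid2);
f5();
wait_for('f3', $pid3);
```

Note that in this sketch a failure of f3 is noticed only at the final wait_for call, which is exactly the limitation described below.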

Every task performed in each step is critical: if any of them fails, the parent process must kill all children and shut itself down. For instance, if 'f3' hits an error, such as a SEGV or someone accidentally sending it a SIGKILL, the parent should learn about it and exit immediately.
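
Once a child has been reaped, telling a signal death (SEGV, SIGKILL) apart from an ordinary nonzero exit comes down to decoding $?. A small sketch, where describe_status is a hypothetical helper name and the child deliberately kills itself just for the demonstration:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the value of $? after
# wait/waitpid) into a short description of how the child ended.
sub describe_status {
    my ($status) = @_;
    my $signal = $status & 127;   # low 7 bits: terminating signal, if any
    my $core   = $status & 128;   # next bit: set if a core dump was produced
    my $code   = $status >> 8;    # high byte: exit code
    return "killed by signal $signal" . ($core ? " (core dumped)" : "")
        if $signal;
    return $code ? "exited with code $code" : "ok";
}

# Demonstration only: the child sends itself a SIGKILL.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    kill 'KILL', $$;
    sleep 10;     # not reached in practice; the signal ends the child first
    exit 0;
}
waitpid($pid, 0);
print "child: ", describe_status($?), "\n";
```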

The problem is that the parent performs several blocking tasks and would learn of a child's death only when it calls 'wait'. The existing codebase already installs several CHLD handlers at various points, so using SIGCHLD is ruled out.

I tried various alternatives:
1. Install an ALRM handler in the parent. The handler would periodically check whether a child has exited and, if so, make the parent die. But if any module installs its own ALRM handler, the whole scheme fails. I tried Alarm::Concurrent, but found it to be buggy. The existing code also calls sleep() in a couple of places, which may interfere with the ALRM handler.
2. Tie %SIG to capture CHLD and override it with a custom subroutine. But this scheme also requires overriding wait() and waitpid(). It may be doable, but it looks too complicated and may affect several parts of the existing code that spawn foreground processes.
3. At various points in the parent code, call a function that checks whether a child has exited. This won't work while the parent is in a blocking call.
4. Make the parent do no work other than simple child monitoring. This needs a lot of code changes in my existing application and is not feasible.
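
For reference, the check function from alternative 3 (which the ALRM handler in alternative 1 would also call) could look roughly like the sketch below. It assumes waitpid() with WNOHANG from POSIX; the %children registry and the name check_children are hypothetical and would have to be wired into wherever the parent forks.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Assumption: pid => name, filled in wherever the parent forks a child.
my %children;

# Hypothetical helper: reap any watched child that has exited, without
# blocking. If one failed (or was reaped behind our back), send TERM to
# the remaining children and die. Returns the number still being watched.
sub check_children {
    for my $pid (keys %children) {
        my $reaped = waitpid($pid, WNOHANG);
        next if $reaped == 0;                 # still running
        my $status = $?;
        my $name   = delete $children{$pid};
        my $problem =
              $reaped == -1 ? "was reaped elsewhere, status unknown"
            : $status != 0  ? "failed with status $status"
            :                 undef;
        next unless defined $problem;         # finished cleanly
        kill 'TERM', keys %children;          # take the others down
        die "$name (pid $pid) $problem\n";
    }
    return scalar keys %children;
}
```

The "reaped elsewhere" branch matters here because an existing CHLD handler that calls wait() may collect the child first, in which case the exit status is lost to this function.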

Alas, UNIX signals suck! :-( Has anyone been stuck with this problem before? Any suggestions?
Thanks.

In reply to How to asynchronisly get notified of a child exit by Mostly Harmless
