karthi.ge has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

We are planning to upgrade our servers from IBM AIX V5.1 to V5.2, but the existing Perl scripts, which work fine on AIX V5.1 (perl 5.6.0), hang on AIX V5.2 (perl 5.8.0). Here is what we have found.

We have four Perl scripts: a) loader, b) load_domain, c) run_command and d) execute_command. loader triggers load_domain, which in turn triggers run_command, which in turn triggers execute_command. The Perl script execute_command triggers Unix shell scripts.

The Perl script run_command is not able to reap the process ID of its grandchild (the one triggered by execute_command). The call used was waitpid(-1, WNOHANG).

This call should return the process ID of the grandchild, but the return value is -1. So the unidentified grandchild script stays as a zombie and the grandparent Perl script (run_command) hangs indefinitely.

The following piece of code is used to reap the grandchild.

************************************************************

sub loader_process_reaper {
    use POSIX qw/:sys_wait_h/;
    while ((my $pid = waitpid(-1, WNOHANG)) != -1) {
        if (WIFEXITED($?)) {
            $domain_loaders{$pid}{SYSTEM_STATUS} = WEXITSTATUS($?);
            $domain_loaders{$pid}{LOADER_STATUS} = RETURNED;
        }
    }
}

************************************************************

Could someone please tell us why this happens? We have been racking our brains over this issue.

Thanks,

Karthik

Edit g0n - added code & formatting tags

Re: wait & waitpid commands
by zentara (Cardinal) on Dec 04, 2005 at 13:12 UTC
    It would be nice if you would put your code into code tags. See the link "Writeup Formatting Tips" on the preview page.

    I'm not really a human, but I play one on earth. flash japh
Re: wait & waitpid commands
by Celada (Monk) on Dec 05, 2005 at 21:36 UTC

    In UNIX, you cannot reap your grandchildren, only your direct children. If the intermediate process (your direct child, the grandchild's parent) dies, then the grandchild is reparented to init (process ID 1) and you, as the grandparent, still have no opportunity to reap it (init will reap it automatically).
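The point above can be checked with a small experiment. This is only a sketch, assuming a POSIX-like system with a real fork; the function name try_reaping_grandchild and the pipe used to hand the grandchild's PID back to the grandparent are illustrative and not taken from the original scripts.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical demo: fork a child that forks a grandchild, then try to
# wait on both from the grandparent and report what waitpid returned.
sub try_reaping_grandchild {
    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $child = fork();
    die "fork failed: $!" unless defined $child;

    if ($child == 0) {
        # Child: start a grandchild, report its PID, then exit.
        close $reader;
        my $grandchild = fork();
        die "fork failed: $!" unless defined $grandchild;
        if ($grandchild == 0) {
            close $writer;
            POSIX::_exit(0);        # grandchild exits immediately
        }
        print {$writer} "$grandchild\n";
        close $writer;              # flushes the PID into the pipe
        POSIX::_exit(0);
    }

    # Grandparent: learn the grandchild's PID from the pipe.
    close $writer;
    my $grandchild_pid = <$reader>;
    close $reader;
    chomp $grandchild_pid;

    # Waiting on the direct child succeeds and returns its PID.
    my $child_result = waitpid($child, 0);

    # Waiting on the grandchild fails: it is not our child.
    my $grand_result = waitpid($grandchild_pid, 0);
    my $is_echild    = $!{ECHILD} ? 1 : 0;   # capture errno right away

    return (
        child_reaped    => ($child_result == $child ? 1 : 0),
        grandchild_wait => $grand_result,
        echild          => $is_echild,
    );
}

my %result = try_reaping_grandchild();
print "direct child reaped: $result{child_reaped}\n";
print "waitpid on grandchild returned: $result{grandchild_wait}\n";
```

The grandparent gets -1 (with errno set to ECHILD) for the grandchild no matter how long it waits, because the kernel only lets a process collect the status of its own direct children.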

    The behaviour is expected: waitpid will hang until something happens to one of its direct children. AIX 5.1 must have been buggy if it did anything else!
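One thing worth checking in the reaper from the question: with WNOHANG, waitpid does not block. It returns 0 while direct children exist but none has exited yet, so a loop that only stops at -1 keeps spinning for as long as any direct child is alive. Whether that explains the difference between the two AIX versions is not something this thread establishes. Below is a minimal sketch of a non-blocking reaper that stops on both 0 and -1; the %children table and the name reap_children are hypothetical stand-ins for the original bookkeeping.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

# Hypothetical bookkeeping table, keyed by child PID.
my %children;

# Collect every direct child that has already terminated, without blocking.
# Returns the number of children reaped in this call.
sub reap_children {
    my $reaped = 0;

    # waitpid(-1, WNOHANG) returns:
    #   > 0   PID of a direct child that has terminated
    #     0   direct children exist, but none has terminated yet
    #    -1   there are no direct children left to wait for
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        my $status = $?;    # save it before anything else can change $?
        if (WIFEXITED($status)) {
            $children{$pid}{exit_code} = WEXITSTATUS($status);
        }
        elsif (WIFSIGNALED($status)) {
            $children{$pid}{signal} = WTERMSIG($status);
        }
        $reaped++;
    }
    return $reaped;
}
```

A function like this can be called from a $SIG{CHLD} handler or periodically from the main loop. Either way it only ever sees direct children, so each level of the loader chain has to reap the processes it started itself and pass the result up, for example through its own exit status.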
