<solved> fork, but not waiting on children

Monk::Thomas has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow monks

I'd like to write a small wrapper script which receives a command and a url as input. It then should try to run the command and if that succeeds create a child/thread to wait for successful completion of the command and finally query the URL. The main program should return as quickly as possible signaling either successful command start or complete failure (e.g. command not found). (The actual command may run for something between a couple of minutes and some hours.)

- try to run the command
- success?
  - spawn child: wait for completion and call URL
  - terminate with exit == 0
- failure?
  - terminate with exit != 0

It's a bit like backgrounding a process, but I want immediate feedback if there is an error.

I've read some tutorials about forking/threads and they did warn that if I'm not waiting on forked child processes to complete then I'll create zombie processes. So before I doing something stupid: How would you implement that? Is forking the correct solution? How to avoid creating zombies?

code is not tested / verified

my $ch = Proc::Background->new(@command);
if (defined $ch->pid()) {
  my $pid = fork();
  if (not defined $pid) {
    die "Cannot fork: $!";
  }
  elsif ($pid == 0) {
    # child process
    my $exitcode = $ch->wait() << 8;
    if ($exitcode == 0) {
      # call URL
    }
  }
  else {
    # parent process
    #waitpid($kidpid, 0); <- do not wait for child
    exit 0;
  }
}
else {
  # failed to execute
  exit 127;
}
[download]

Target architecture is Linux, no need to care about Windows/UNIX pecularities.

$ perl /tmp/scratch.pl true
[child] Executing true
[child] Exiting with exit code 0.
[parent] Command has already terminated. (exit code = 0)
[parent] Terminating with exit code 0.

$ perl /tmp/scratch.pl false
[child] Executing false
[child] Exiting with exit code 1.
[parent] Command has already terminated. (exit code = 1)
[parent] Terminating with exit code 1.

$ perl /tmp/scratch.pl sleep 5
[child] Executing sleep 5
[parent] Command is still running.
[parent] Terminating with exit code 0.
  <after 5 seconds>
[child] Exiting with exit code 0.

$  perl /tmp/scratch.pl "sleep 5 ; false"
[child] Executing sleep 5 ; false
[parent] Command is still running.
[parent] Terminating with exit code 0.
  <after 5 seconds>
[child] Exiting with exit code 1.
[download]

Solution 1 - manual fork

#!/usr/bin/env perl

use strict;
use warnings;

use 5.010_001;

# essential libraries
use English                    qw( -no_match_vars );
use POSIX                      qw(:sys_wait_h);

my @cmd     = @ARGV;
my $cmd_str = join ' ', @cmd;

my $pid = fork();
if (not defined $pid) {             # fork failed
    die "Cannot fork: $ERRNO\n";
}
elsif ($pid == 0) {                 # child process
    say("Executing $cmd_str");

    my $retval   = system(@cmd);
    my $exitcode = parse_retval($retval);

    # TODO add URL callback
    say("[child] Exiting with exit code $exitcode.");
    exit $exitcode;
}
else {                              # parent process
    # wait a bit to give child process a chance to die prematurely
    sleep 1;

    my $cstatus = waitpid($pid, WNOHANG);
    my $exitcode;
    if ($cstatus == 0) {
        $exitcode = 0;
        say("[parent] Child is still running.");
    }
    else {
        $exitcode = parse_retval(${^CHILD_ERROR_NATIVE});
    }

    say("[parent] Exiting with exit code $exitcode.");
    exit $exitcode;
}

sub parse_retval {
    my $retval = shift;

    my $signal        = $retval & 127;
    my $have_coredump = $retval & 128;

    my $exitcode;
    if ($retval == -1) {
        # program failed to execute
        $exitcode = 127;
    }
    elsif ($signal) {
        my $message = "W: Process died with signal $signal!";
        if ($have_coredump) {
            $message .= ' (coredump available)';
        }
        warn($message ."\n");
        $exitcode = 255;
    }
    else {
        $exitcode = $retval >> 8;
    }

    return $exitcode;
}
[download]

Solution 2 - Proc::Simple

#!/usr/bin/env perl

use strict;
use warnings;

use 5.010_001;

# essential libraries
use English                    qw( -no_match_vars );
use POSIX                      qw(:sys_wait_h);
use Proc::Simple;

my @cmd = @ARGV;
my $url;          # unused - function not implemented

my $child = Proc::Simple->new();
$child->start(\&child, \@cmd, $url);

# wait a bit to give child process a chance to die prematurely
sleep 1;

my $exitcode;
if ($child->poll() == 1) {
    $exitcode = 0;
    say("[parent] Command is still running.");
}
else {
    $exitcode = parse_retval( $child->exit_status() );
    say("[parent] Command has already terminated. (exit code = $exitco
+de)");
}

say("[parent] Terminating with exit code $exitcode.");
exit $exitcode;

sub parse_retval {
    my $retval = shift;

    my $signal        = $retval & 127;
    my $have_coredump = $retval & 128;

    my $exitcode;
    if ($retval == -1) {
        # program failed to execute
        $exitcode = 127;
    }
    elsif ($signal) {
        my $message = "W: Process died with signal $signal!";
        if ($have_coredump) {
            $message .= ' (coredump available)';
        }
        warn($message ."\n");
        $exitcode = 255;
    }
    else {
        $exitcode = $retval >> 8;
    }

    return $exitcode;
}

sub child {
    my @cmd = @{ +shift };
    my $url = shift;

    my $cmd_str = join ' ', @cmd;

    say("[child] Executing $cmd_str");

    my $retval   = system(@cmd);
    my $exitcode = parse_retval($retval);

    # TODO add URL callback
    say("[child] Exiting with exit code $exitcode.");
    exit $exitcode;
}
[download]

Comment on <solved> fork, but not waiting on children Select or Download Code

Replies are listed 'Best First'.
Re: fork, but not waiting on childs by blindluke (Hermit) on Sep 23, 2014 at 12:09 UTC
I'd suggest using waitpid, even though your intention is NOT to wait for the child process. You spawn the job, wait 1 second, and then check the waitpid return value. If it's -1, you report that the job exited with an error. If it's 0, the job is running - and now you spawn a child to periodically check the return value of the waitpid until it's !=0. I'll try to update the comment with a code example, just give me a while. Oh, and be sure to invoke waitpid in a nonblocking way, `waitpid($pid, WNOHANG);` [download] regards, Luke Jefferson	[reply] [d/l]
Re^2: fork, but not waiting on children by Monk::Thomas (Friar) on Sep 23, 2014 at 17:53 UTC
Thanks for the suggestion Luke, I think I got it now. I'm a bit surprised. I don't do anything special regarding the zombie issue but top does not list any zombies - although the documentation warned about this and the parent process has already died? (I executed `./forker.pl find /usr -type f >/dev/null`.) code #!/usr/bin/env perl use strict; use warnings; use 5.010_001; # essential libraries use English qw( -no_match_vars ); use POSIX qw(:sys_wait_h); my @cmd = @ARGV; my $cmd_str = join ' ', @cmd; my $pid = fork(); if (not defined $pid) { # fork failed die "Cannot fork: $ERRNO\n"; } elsif ($pid == 0) { # child process say("Executing $cmd_str"); my $retval = system(@cmd); my $exitcode = parse_retval($retval); # TODO add URL callback say("[child] Exiting with exit code $exitcode."); exit $exitcode; } else { # parent process # wait a bit to give child process a chance to die prematurely sleep 1; my $cstatus = waitpid($pid, WNOHANG); my $exitcode; if ($cstatus == 0) { $exitcode = 0; say("[parent] Child is still running."); } else { $exitcode = parse_retval(${^CHILD_ERROR_NATIVE}); } say("[parent] Exiting with exit code $exitcode."); exit $exitcode; } sub parse_retval { my $retval = shift; my $signal = $retval & 127; my $have_coredump = $retval & 128; my $exitcode; if ($retval == -1) { # program failed to execute $exitcode = 127; } elsif ($signal) { my $message = "W: Process died with signal $signal!"; if ($have_coredump) { $message .= ' (coredump available)'; } warn($message ."\n"); $exitcode = 255; } else { $exitcode = $retval >> 8; } return $exitcode; } [download] result $ ./forker.pl true ; echo "EXIT $?" Executing true [child] Exiting with exit code 0. [parent] Exiting with exit code 0. EXIT 0 $ ./forker.pl false ; echo "EXIT $?" Executing false [child] Exiting with exit code 1. [parent] Exiting with exit code 1. EXIT 1 $ ./forker.pl sleep 10 ; echo "EXIT $?" Executing sleep 10 [parent] Child is still running. [parent] Exiting with exit code 0. EXIT 0 <10 seconds later> [child] Exiting with exit code 0. $ ./forker.pl bogus ; echo "EXIT $?" Executing bogus Can't exec "bogus": No such file or directory at ./forker.pl line 22. [child] Exiting with exit code 127. [parent] Exiting with exit code 127. EXIT 127 $ ./bogus ; echo "EXIT $?" bash: ./bogus: No such file or directory EXIT 127 [download]	[reply] [d/l] [select]
Re^3: fork, but not waiting on children by blindluke (Hermit) on Sep 23, 2014 at 18:39 UTC
Glad to see you figured it out, and it's working. I started to play around with a proof of concept, and then I went looking for some nice way to do it (hiding the nasty flags under the hood). I found a very promising module: Proc::Simple. It looks like something that could be useful to you. As for the zombies, don't worry: if a parent terminates without getting the child exit status (without wait()ing for the child), the init process will wait() for the child instead, so it does not become a zombie. You can look at the notes section in man 2 wait, the whole thing is described there very precisely. All the best. regards, Luke Jefferson	[reply]
Re^4: fork, but not waiting on children by Monk::Thomas (Friar) on Sep 25, 2014 at 08:06 UTC