sojourn548 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am using Parallel::ForkManager to spawn several child processes that each run an external command, and I want each child's exit code returned to the parent so it can be entered into a DB. I can print the exit code from the run_on_finish callback, but I haven't figured out how to process all the exit codes in the parent. The documentation states that run_on_finish() is called in the parent process, but it seems like there is no way to access any of the parent's variables from it. (Unless I am doing something wrong.) Eventually, I'd like to insert all the children's exit codes into MySQL over the parent's DB connection.

What's the best way to do this? I've considered declaring a hash in the parent and adding key-value pairs to it from the run_on_finish() callback, but it looks like the only parameters passed into run_on_finish() are: pid, exit_code, ident, exit_signal, and core.

Am I barking up the wrong tree here? Should I explore using IPC::Shareable or pipes to communicate between the parent and the children?

Thanks in advance, and here's the code snippet:

my $MAX_PROCESSES = 120;
my $pm = new Parallel::ForkManager($MAX_PROCESSES);

$pm->run_on_finish(
    sub {
        my ($pid, $exit_code, $ident) = @_;
        print "run_on_finish: $ident (pid: $pid) exited "
            . "with code: [$exit_code]\n";
        somehow_pass_exit_code_and_ident_to_parent_for_processing();
    }
);

$pm->run_on_start(
    sub {
        my ($pid, $ident) = @_;
        print "** $ident started, pid: $pid\n";
    }
);

for (my $i = 0; $i < @hosts; $i++) {
    $pm->start($hosts[$i]) and next;
    system(@some_command);
    $return_code = $? >> 8;
    $pm->finish($return_code);
}

Re: Parallel::ForkManager run_on_finish exit code handler
by ikegami (Patriarch) on Sep 08, 2009 at 15:17 UTC

    but it seems like there is no way to access any of the variables of the parent.

    From the on_finish handler? That makes no sense. What did you try?

    Or from the child? The only communication P::FM gives you is the exit code. If that's not good enough, you'll need to create your own. (Assuming that's possible.)
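
If the exit code alone is not enough, one way to "create your own" channel is a plain pipe between parent and child. The following is a minimal sketch that uses only core pipe/fork/waitpid rather than Parallel::ForkManager; run_child_with_report is a hypothetical helper name, and the payload is just a string chosen for illustration.

```perl
use strict;
use warnings;

# Flush STDOUT immediately so a forked child never re-emits buffered output.
$| = 1;

# Hypothetical helper: run $work in a child process and return
# (text the child wrote to the pipe, child's exit code).
sub run_child_with_report {
    my ($work) = @_;    # code ref returning the string to send back

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: write the report to the pipe, then exit.
        close $reader;
        print {$writer} $work->();
        close $writer;
        exit 0;
    }

    # Parent: read until the child closes its end, then reap it.
    close $writer;
    my $report = do { local $/; <$reader> };
    close $reader;
    waitpid($pid, 0);

    return ($report, $? >> 8);
}

my ($report, $code) = run_child_with_report(sub { "hello from pid $$" });
print "child said: [$report], exit code: $code\n";
```

The parent closes its copy of the write end before reading; otherwise the read would never see end-of-file.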

      Okay, I spoke too soon. A hash defined in the parent IS readable and writable from run_on_finish().

      my %rc_hash = ();
      my $MAX_PROCESSES = 120;
      my $pm = new Parallel::ForkManager($MAX_PROCESSES);

      $pm->run_on_finish(
          sub {
              my ($pid, $exit_code, $ident) = @_;
              print "run_on_finish: $ident (pid: $pid) exited "
                  . "with code: [$exit_code]\n";
              $rc_hash{$pid} = $exit_code;
          }
      );

      $pm->run_on_start(
          sub {
              my ($pid, $ident) = @_;
              print "** $ident started, pid: $pid\n";
          }
      );

      for (my $i = 0; $i < @hosts; $i++) {
          $pm->start($hosts[$i]) and next;
          system(@some_command);
          $return_code = $? >> 8;
          $pm->finish($return_code);
      }

      foreach my $key (keys %rc_hash) {
          print "$key => $rc_hash{$key}\n";
      }

      I made an incorrect assumption. What I was trying to do earlier was use a database handle from run_on_finish(), and that didn't work, so I thought the same applied to all other variables. I am not sure why accessing a hash would work but accessing a db handle wouldn't. Is this possible?

      my $dbh = DBI->connect("DBI:mysql:database=mydb;host=$DB", "user", "pass", {'RaiseError' => 1});
      my $MAX_PROCESSES = 120;
      my $pm = new Parallel::ForkManager($MAX_PROCESSES);

      $pm->run_on_finish(
          sub {
              my ($pid, $exit_code, $ident) = @_;
              print "run_on_finish: $ident (pid: $pid) exited "
                  . "with code: [$exit_code]\n";
              insert_into_db(\$dbh, $pid, $exit_code);
          }
      );

      $pm->run_on_start(
          sub {
              my ($pid, $ident) = @_;
              print "** $ident started, pid: $pid\n";
          }
      );

      for (my $i = 0; $i < @hosts; $i++) {
          $pm->start($hosts[$i]) and next;
          system(@some_command);
          $return_code = $? >> 8;
          $pm->finish($return_code);
      }

      $dbh->disconnect();

      sub insert_into_db {
          my $dbhdl    = shift;
          my $pid      = shift;
          my $ret_code = shift;
          $$dbhdl->do(INSERT INTO system_results ... .. );
      }

      If I can't use the db handle from run_on_finish(), I can still add the return codes to %rc_hash and process them after $pm->finish(). Is this the right thing to do? Thanks for the speedy responses. I hope I explained my problem a little better this time around.
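
The "collect in a hash, process afterwards" idea only works once every child has been reaped in the parent. With Parallel::ForkManager that is what wait_all_children is for; the sketch below shows the same pattern with core fork/waitpid only, so it does not depend on the module. collect_exit_codes and the host_a/host_b job names are hypothetical, chosen for illustration.

```perl
use strict;
use warnings;

# Flush STDOUT immediately so a forked child never re-emits buffered output.
$| = 1;

# Hypothetical helper: run each job in its own child process and
# return a hash ref of { ident => exit code }, filled in by the parent.
sub collect_exit_codes {
    my (%jobs) = @_;    # ident => code ref returning the desired exit code

    my %ident_for_pid;
    my %exit_code_for;

    for my $ident (sort keys %jobs) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: do the work and report the result as the exit code.
            exit( $jobs{$ident}->() );
        }
        # Parent: remember which ident this pid belongs to.
        $ident_for_pid{$pid} = $ident;
    }

    # Parent: reap every child before touching the results.
    while (%ident_for_pid) {
        my $pid = waitpid(-1, 0);
        last if $pid == -1;                          # no children left
        next unless exists $ident_for_pid{$pid};     # not one of ours
        my $ident = delete $ident_for_pid{$pid};
        $exit_code_for{$ident} = $? >> 8;
    }

    return \%exit_code_for;
}

my $results = collect_exit_codes(
    host_a => sub { 0 },
    host_b => sub { 3 },
);
print "$_ => $results->{$_}\n" for sort keys %$results;
```

Because the hash is only written in the parent, no sharing mechanism is needed; the results can be inserted into a database after the loop finishes.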

        What I was trying to do earlier is use a database handle from run_on_finish()

        The child inherited the database handle and closed it. P::FM uses exit when it should probably use POSIX's _exit. finish is simply a call to exit, so if you call _exit instead of finish, that problem should go away.

        That means you should replace

        for (my $i = 0; $i < @hosts; $i++) {
            $pm->start($hosts[$i]) and next;
            system(@some_command);
            $return_code = $? >> 8;
            $pm->finish($return_code);
        }

        with

        use POSIX qw( _exit );

        for (my $i = 0; $i < @hosts; $i++) {
            $pm->start($hosts[$i]) and next;
            system(@some_command);
            $return_code = $? >> 8;
            _exit($return_code);
        }
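
To see why exit and _exit behave differently here, the following sketch forks a child that holds an object with a DESTROY method and checks whether that destructor ran in the child. FakeHandle is a hypothetical stand-in for a database handle, not DBI itself; the assumption is that a real handle's destructor is what closes the shared connection.

```perl
use strict;
use warnings;
use POSIX qw( _exit );

# Flush STDOUT immediately so a forked child never re-emits buffered output.
$| = 1;

{
    # Hypothetical stand-in for a handle whose destructor has side effects.
    package FakeHandle;

    sub new {
        my ($class, $out) = @_;
        return bless { out => $out }, $class;
    }

    # Report on the pipe that the destructor ran.
    sub DESTROY {
        my ($self) = @_;
        syswrite($self->{out}, "D") if $self->{out};
    }
}

# Fork a child that owns a FakeHandle and leaves via exit or _exit.
# Returns 1 if DESTROY ran in the child, 0 otherwise.
sub destroy_ran_in_child {
    my ($how) = @_;    # 'exit' or '_exit'

    pipe(my $reader, my $writer) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        close $reader;
        my $handle = FakeHandle->new($writer);
        if ($how eq 'exit') { exit(0); }     # runs destructors and END blocks
        else                { _exit(0); }    # leaves immediately, no cleanup
    }

    close $writer;
    my $got = do { local $/; <$reader> };
    waitpid($pid, 0);

    return (defined $got && length $got) ? 1 : 0;
}

for my $how ('exit', '_exit') {
    printf "%-5s: DESTROY ran in child: %s\n",
        $how, destroy_ran_in_child($how) ? 'yes' : 'no';
}
```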

        Of course, system + _exit is a silly way of doing exec, so you get:

        for my $host (@hosts) {
            $pm->start($host) and next;
            exec(@some_command);
        }

        Error checking added:

        for my $host (@hosts) {
            $pm->start($host) and next;
            exec(@some_command);
            print(STDERR "exec failed: $!\n");
            _exit($!);
        }

        That also fixes the bug where you'd return zero (success) when the child died from a signal.
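
The signal problem comes from how $? is laid out: the exit code lives in the high byte, so a child killed by a signal yields zero from $? >> 8. A small sketch of decoding all three parts follows; decode_wait_status and child_succeeded are hypothetical helper names, and the sketch does not handle the special value -1 that system returns when the command could not be started.

```perl
use strict;
use warnings;

# Split a raw wait status (the value of $? after system() or waitpid())
# into exit code, terminating signal, and core-dump flag.
sub decode_wait_status {
    my ($status) = @_;
    return {
        exit_code => $status >> 8,
        signal    => $status & 127,
        core_dump => ($status & 128) ? 1 : 0,
    };
}

# Treat a child as successful only if it exited normally with code 0.
sub child_succeeded {
    my ($status) = @_;
    my $parts = decode_wait_status($status);
    return ($parts->{signal} == 0 && $parts->{exit_code} == 0) ? 1 : 0;
}

# A clean exit, exit code 3, and death by signal 9.
for my $status (0, 3 << 8, 9) {
    my $p = decode_wait_status($status);
    printf "status %d: exit=%d signal=%d ok=%d\n",
        $status, $p->{exit_code}, $p->{signal}, child_succeeded($status);
}
```

For the signal 9 case the exit code is 0, which is exactly the false "success" described above; checking the signal field as well avoids it.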

        Update: Clean up.

Re: Parallel::ForkManager run_on_finish exit code handler
by derby (Abbot) on Sep 08, 2009 at 15:45 UTC

    Use a closure:

    my $stats = [];
    my $code  = create_closure( $stats );
    ....
    $pm->run_on_finish( $code );
    ....
    print Dumper( $stats );

    sub create_closure {
        my( $stats ) = @_;
        return sub {
            my ($pid, $exit_code, $ident) = @_;
            push( @$stats, { pid => $pid, exit_code => $exit_code, ident => $ident } );
        }
    }

    -derby
      He already is, which is why I asked to see the problem.

        Unless the OP edited the code, his example does not have a closure. He alluded to tracking the exit codes via a hash, but that sounded more like a global than a closure. I believe the OP may be looking for some way to update his database inside the run_on_finish callback; in that case he could just close over the database handle instead of some data structure that would be processed *after* all the children have run.

        Update: ... forgot an (untested) example:

        my $dbh  = create_dbh_handle();
        my $code = create_closure( $dbh );
        ....
        my $pm = Parallel::ForkManager->new( $MAX_NR_PROCESSES );
        $pm->run_on_finish( $code );

        foreach ... {
            my $pid = $pm->start and next;
            ...
            $pm->finish; # Terminates the child process
        }

        sub create_closure {
            my( $dbh ) = @_;
            return sub {
                my ($pid, $exit_code, $ident) = @_;
                my $sql = "insert into table values( $pid, $exit_code )";
                $dbh->do( $sql );
            }
        }

        -derby