delirium has asked for the wisdom of the Perl Monks concerning the following question:

I'm working on some code to run some comm sessions in parallel, and update a database when all the sessions have completed. The database contains things like filenames, last run times, etc.

The problem I'm running into is that I'm reading the database into a hash, and dumping the hash back to the database at the end of the script. The hash copies in each fork don't update the parent's hash, and I'm left with the hash as it was at load time being re-written. Here is some dumbed-down code to illustrate:

#!/usr/bin/perl -w
use strict;
use Data::Dumper;
use Parallel::ForkManager;

my $pm = new Parallel::ForkManager(10);
my $update_flag = 0;

$pm->run_on_finish( sub {
    my (undef, $exit_code, $ident) = @_;
    $update_flag = 1 if $exit_code;
} );

my %sess_hist = ();
my %hash;
&load_database;

for my $session (keys %{ $hash{Session} }) {
    my $pid = $pm->start($session) and next;
    if (&check_overdue($session)) {
        &run_session($session);  # &run_session updates %sess_hist with new stats
    }
    else {
        exit(0);
    }
    $pm->finish($session);
}
$pm->wait_all_children;
&save_database if $update_flag;

I'd be more than happy to ditch Data::Dumper in favor of a simple database, but the ones I've played with (NDBM_File, SDBM_File, etc.) all seem to do a final untie() at the end to re-write the database, putting me back in the same boat.

What's a good way out of this with the least amount of module installation?

Thanks.

Update:

Wow, that was really a "Bread good, fire bad" moment. Yes, my "database" is nothing more than a Data::Dumper printout. The easy solution occurred to me on the drive home: child processes create a new hash of things to change, then re-read the original hash from file, merge the changes, and re-save.

Next time: more caffeine, less knee-jerk question posting.
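That merge step might look something like this. A sketch only: the sub name merge_and_save, the file locking, and the { session => { field => value } } change-hash layout are my assumptions, not the original poster's code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Data::Dumper;

# Hypothetical helper: each child collects its updates in its own small
# hash, then re-reads the master file, overlays the changes, and re-saves.
sub merge_and_save {
    my ($dbfile, $changes) = @_;   # $changes = { session => { field => value } }

    open my $fh, '+<', $dbfile or die "open $dbfile: $!";
    flock $fh, LOCK_EX or die "flock $dbfile: $!";  # serialize the children

    local $/;                      # slurp mode
    my $dump = <$fh>;
    my $VAR1;                      # Data::Dumper writes "$VAR1 = {...};"
    my $master = (defined $dump && $dump =~ /\S/) ? eval $dump : {};
    die "could not parse $dbfile: $@" if $@;

    # Overlay this child's changes on the freshly re-read master copy
    for my $session (keys %$changes) {
        $master->{$session}{$_} = $changes->{$session}{$_}
            for keys %{ $changes->{$session} };
    }

    seek $fh, 0, 0;
    truncate $fh, 0;
    print {$fh} Dumper($master);
    close $fh or die "close $dbfile: $!";
}
```

The exclusive flock matters here: without it, two children finishing at the same moment could each read the same master copy and the slower writer would silently drop the faster one's changes.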

Replies are listed 'Best First'.
Re: Forking and database updates
by mpeppler (Vicar) on Nov 11, 2003 at 22:12 UTC
    If I understand you correctly, you want the data that each child writes out to the "database" to be picked up by the parent when it re-reads the "database". If that is the case, then the first thing you need to think of is that the data from the different child processes needs to be merged so that all the changes are preserved; alternatively, write one file per child and read those files back in the parent.

    Alternatively, if you want your child process changes to be reflected directly in the parent, then you need to use some form of shared memory setup, or have the parent read data from each of the child processes and update its data that way.
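    A minimal sketch of that second variant, with one pipe per child. The "key=value" report format and the session names are made up for illustration; the point is only that the parent folds each child's report into its own hash.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %sess_hist;
my @sessions = qw(alpha beta);     # hypothetical session names

# Parent opens a pipe per child; each child prints its updates as
# simple "key=value" lines before exiting.
my %readers;
for my $session (@sessions) {
    pipe(my $r, my $w) or die "pipe: $!";
    defined(my $pid = fork) or die "fork: $!";
    if ($pid == 0) {               # child
        close $r;
        # ... run the comm session here ...
        print {$w} "$session.last_run=", time, "\n";
        close $w;
        exit 0;
    }
    close $w;                      # parent keeps only the read end
    $readers{$pid} = $r;
}

# Parent merges every child's report into its own copy of the hash
for my $pid (keys %readers) {
    my $r = $readers{$pid};
    while (my $line = <$r>) {
        chomp $line;
        my ($key, $value) = split /=/, $line, 2;
        $sess_hist{$key} = $value;
    }
    close $r;
    waitpid $pid, 0;
}
```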

    Michael

Re: Forking and database updates
by Roger (Parson) on Nov 11, 2003 at 22:56 UTC
    Reminds me of a resource serialization problem I had back in the old uni days, where I had to synchronize data between parent/children.

    One of the solutions was to use an external database, where database I/O's were handled by an independent (external) process.

    Another solution was to use shared memory (I was coding it in C then). I think it must be our liz who has posted this module on CPAN (well done liz) - forks::shared. It states that:

    The "forks::shared" pragma allows a developer to use shared variables with threads (implemented with the "forks" pragma) without having to have a threaded perl, or to even run 5.8.0 or higher.

    An example pulled from CPAN:

    use forks;
    use forks::shared;

    my $variable : shared;
    my @array    : shared;
    my %hash     : shared;

    share( $variable );
    share( @array );
    share( %hash );

    lock( $variable );
    cond_wait( $variable );
    cond_signal( $variable );
    cond_broadcast( $variable );
    Take a look at this module, I think this might be a more elegant solution to your problem.

Re: Forking and database updates
by perrin (Chancellor) on Nov 11, 2003 at 22:07 UTC
Re: Forking and database updates
by blue_cowdawg (Monsignor) on Nov 11, 2003 at 21:42 UTC

        The hash copies in each fork don't update the parent's hash, and I'm left with the hash as it was at load time being re-written. Here is some dumbed-down code to illustrate:

    If I understand your statement correctly, I would not expect modifications of data within a child to be "felt" back at the parent. When you fork a child, it receives a copy of the parent's data space as of the fork. When the child terminates, any changes to that copy are lost with it.

    Now, if your parent reads the database after the children update it and doesn't see changes then you have a completely different problem set.
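    That copy-on-fork behavior is easy to demonstrate with a few lines (a made-up example, not the poster's code): the child's write never reaches the parent.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %h = ( count => 0 );

defined(my $pid = fork) or die "fork: $!";
if ($pid == 0) {
    $h{count} = 42;     # modifies the child's private copy only
    exit 0;
}
waitpid $pid, 0;
print "count is $h{count}\n";   # the parent still sees 0
```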


    Peter L. Berghold -- Unix Professional
    Peter at Berghold dot Net
       Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.