PerlMonks
forked map

by LanceDeeply (Chaplain)
on Oct 27, 2006 at 21:28 UTC ( [id://581025] )


in reply to Using DBM::Deep and Parallel::ForkManager for a generalized parallel hashmap function builder (followup to "reducing pain of parallelization with FP")

hi-

instead of managing all that file locking complexity, why not write the result of each element to its own file? then there's no need for DBM::Deep: each child persists to a different file without worrying about stepping on anyone else's results. ( i used Storable's freeze/thaw to save complex data structures )
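the freeze/thaw round trip for one element looks like this ( a minimal sketch; the result structure and the filename are made up for illustration ):

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# hypothetical complex result from one child ( illustrative data )
my $result = { name => 'blee', matched => 1, tries => [ 1, 2, 3 ] };

# each child writes to its own file, keyed by element id, so no locking
my $tempfile = 'result.0';    # illustrative name: one file per id
open( my $out, '>', $tempfile ) or die $!;
binmode $out;                 # frozen data is binary
print $out freeze( \$result );
close $out;

# the parent later slurps and thaws each file
open( my $in, '<', $tempfile ) or die $!;
binmode $in;
my $frozen = do { local $/; <$in> };
close $in;
my $copy = ${ thaw($frozen) };

print $copy->{name}, "\n";    # prints "blee"
unlink $tempfile;             # unlink is fine here: it's a file, not a dir
```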

forked_map takes a dir where it temporarily persists results, a max number of workers ( forked children ) so you can tune it as you wish, the function to perform, and the data to perform it on. it uses the same pattern as Re: Run N similar tasks in parallel
use strict;
use warnings;
use Data::Dumper;
use File::Path;
use File::Spec;
use Storable qw(freeze thaw);
use POSIX ":sys_wait_h";

my $slow_matches_b = sub {
    sleep 1;
    return unless $_[0];
    return 1 if $_[0] =~ /b/;
};

my $test_strings = [ ( 'blee', 'blah', 'bloo', 'qoo', 'fwee' ) ];

my @results = forked_map( 'c:\\testdir', 2, $slow_matches_b, @$test_strings );
print Dumper \@results;

sub forked_map {
    my $tempdir      = shift;
    my $worker_count = shift;
    my $function     = shift;

    #
    # ensure tempdir is not some file
    #
    if ( -f $tempdir ) {
        die "$tempdir exists!";
    }

    #
    # delete dir, then recreate ( destroy any previous results )
    # ( rmtree, not unlink - unlink does not remove directories )
    #
    File::Path::rmtree( $tempdir );
    File::Path::mkpath( $tempdir );

    #
    # assign keys to each data element
    #
    my $id = 0;
    my %hash_input = map { ( $id++, $_ ) } @_;

    my %workers;

    #
    # loop block for assigning work to workers
    #
    {
        #
        # assign work to available workers
        #
        while ( keys %hash_input and keys %workers < $worker_count ) {
            $id = ( keys %hash_input )[0];
            my $data = $hash_input{$id};
            delete $hash_input{$id};

            #
            # fork the child process
            #
            my $pid = fork;
            if ( !defined $pid ) {
                die "cannot fork: $!";
            }
            if ( $pid ) {
                #
                # track the pid in the parent process
                #
                $workers{$pid} = 'ACTIVE';
            }
            else {
                #
                # work in the child process
                #
                my $result        = $function->($data);
                my $frozen_result = freeze( \$result );

                #
                # save the results ( binmode - frozen data is binary )
                #
                my $tempfile = File::Spec->catfile( $tempdir, $id );
                open( my $fh, '>', $tempfile ) or die $!;
                binmode $fh;
                print $fh $frozen_result;
                close $fh;

                #
                # bye bye baby
                #
                exit 0;
            }
        }

        #
        # wait for any child to complete
        #
        my $pid = wait();
        if ( $pid == -1 ) {
            %workers = ();
        }
        else {
            delete $workers{$pid};
        }

        #
        # loop if there is any work remaining
        #
        redo if ( keys %hash_input or keys %workers );
    }

    #
    # read results from file
    #
    my %results;
    opendir( my $dh, $tempdir ) or die $!;
    for ( readdir($dh) ) {
        my $filepath = File::Spec->catfile( $tempdir, $_ );
        if ( -f $filepath ) {
            open( my $fh, '<', $filepath ) or die $!;
            binmode $fh;
            my $frozen_result = do { local $/; <$fh> };
            $results{$_} = ${ thaw($frozen_result) };
        }
    }

    #
    # clean up after ourselves
    #
    File::Path::rmtree( $tempdir );

    #
    # return results in order ( numeric sort - the ids are numbers,
    # and a string sort would put "10" before "2" )
    #
    return map { $results{$_} } sort { $a <=> $b } keys %results;
}
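one gotcha in the final map: readdir hands the ids back as strings, so with ten or more elements a plain sort scrambles the order ( "10" sorts before "2" ), and the results no longer line up with the input. a numeric sort fixes that ( the id list below is illustrative ):

```perl
use strict;
use warnings;

# ids as they come back from readdir ( compared as strings by default )
my @ids = ( 0, 1, 2, 10, 11 );

my @string_order  = sort @ids;                  # string comparison
my @numeric_order = sort { $a <=> $b } @ids;    # numeric comparison

print "@string_order\n";     # prints "0 1 10 11 2"
print "@numeric_order\n";    # prints "0 1 2 10 11"
```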

update: cleaned up freeze thaw usage
