in reply to Re: Storable problem of data sharing in multiprocess
in thread Storable problem of data sharing in multiprocess

Please see the test result at this link: http://www.aobu.net/cgi-bin/test_gseSM.pl. You can see that each child process is doing the same thing, without sharing %urls and @unique_urls between them.

Replies are listed 'Best First'.
Re^3: Storable problem of data sharing in multiprocess (only parent partitions jobs)
by Anonymous Monk on Oct 03, 2014 at 09:34 UTC

    Here is what I would do: partition the url list upfront into different storable files, so when you fork you're only sharing a single filename; then later the parent process unifies the results of the child processes ... only the parent partitions jobs, because only the parent spawns children.

    #!/usr/bin/perl --
    ## perltidy -olq -csc -csci=3 -cscl="sub : BEGIN END " -otr -opr -ce -nibc -i=4 -pt=0 "-nsak=*"
    use strict;
    use warnings;
    use Storable qw/ lock_store /;
    use Data::Dump qw/ dd /;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my @files = StorePartitionUrls( GetInitialUniqueUrls() );
        ForkThisStuff( @files );
        UnifyChildResults( 'Ohmy-unique-hostname-urls-storable', @files );
    } ## end sub Main

    sub GetInitialUniqueUrls {
        my @urls;
        ...;
        return \@urls;
    } ## end sub GetInitialUniqueUrls

    sub ForkThisStuff {
        my @files = @_;
        ## something forking here -- spawn kids with one file each, wait, whatever
        for my $file ( @files ) {
            EachChildGetsItsOwn( $file );
        }
    } ## end sub ForkThisStuff

    sub StorePartitionUrls {
        my( $urls, $partition, $fnamet ) = @_;
        $partition ||= 100;
        $fnamet    ||= 'Ohmy-candidate-urls-%d-%d-storable';
        my @files;
        while( @$urls ){
            my @hundred = splice @$urls, 0, $partition;
            my $file = sprintf $fnamet, int( @$urls ), int( @hundred );
            lock_store \@hundred, $file;
            push @files, $file;
        }
        return @files;
    } ## end sub StorePartitionUrls

    __END__
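    The snippet above leaves ForkThisStuff's actual forking, EachChildGetsItsOwn, and UnifyChildResults as sketches. Here is one hedged way those could look, using plain fork/waitpid and lock_retrieve; the "$file-result" naming scheme and the length-of-url stand-in "work" are my assumptions, not from the original post:

    ```perl
    #!/usr/bin/perl --
    ## Hypothetical sketch only: result-file names and the per-url "work"
    ## are illustrative assumptions, not the original poster's code.
    use strict;
    use warnings;
    use Storable qw/ lock_store lock_retrieve /;

    sub ForkThisStuff {
        my @files = @_;
        my @pids;
        for my $file ( @files ) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if( $pid == 0 ){    ## child: touches only its own partition file
                EachChildGetsItsOwn( $file );
                exit( 0 );
            }
            push @pids, $pid;
        }
        waitpid( $_, 0 ) for @pids;    ## parent waits for every child
    } ## end sub ForkThisStuff

    sub EachChildGetsItsOwn {
        my( $file ) = @_;
        my $urls = lock_retrieve( $file );    ## child sees only its slice
        ## stand-in for real crawling: record the length of each url
        my %results = map { $_ => length $_ } @$urls;
        lock_store \%results, "$file-result";    ## assumed naming scheme
    } ## end sub EachChildGetsItsOwn

    sub UnifyChildResults {
        my( $outfile, @files ) = @_;
        my %unified;
        for my $file ( @files ) {
            my $results = lock_retrieve( "$file-result" );
            @unified{ keys %$results } = values %$results;    ## parent merges
        }
        lock_store \%unified, $outfile;
        return \%unified;
    } ## end sub UnifyChildResults

    1;
    ```

    The point is the same as the parent post: children never share %urls, they each read one storable file, write one storable file, and only the parent merges.
    
    
    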
      As jellisi2 said, this script will not work, and it will also make things more difficult to run, because it may crawl a single page many, many times.

        As jellisi2 said, this script will not work, and it will also make things more difficult to run, because it may crawl a single page many, many times.

        Well, jellisi2 did not say that, and it's not true anyway; it most definitely will work.