in reply to Parallel::ForkManager How to wait when 1st 10 Child processe are over out of 100

This might work. I didn't run it myself.

my $total_children = 1000;
my $max_children   = 100;
my $job_size       = 10;
my $main_pfm = P::FM->new( $max_children / $job_size );

foreach ( 1 .. $total_children / $job_size ) {
    $main_pfm->start and next;
    run_job( $job_size );
    $main_pfm->finish;
}
$main_pfm->wait_all_children;

sub run_job {
    my $job_size = shift;
    my $job_pfm  = P::FM->new( $job_size );
    foreach ( 1 .. $job_size ) {
        $job_pfm->start and next;
        # job
        $job_pfm->finish;
    }
    $job_pfm->wait_all_children;    # wait for this block's workers before returning
}

Re^2: Parallel::ForkManager How to wait when 1st 10 Child processe are over out of 100
by runrig (Abbot) on Dec 01, 2011 at 16:45 UTC
    I believe he wants 100 jobs total to run simultaneously, so if the parent runs 10 processes and each child runs 10, then there will be 100 total. You currently have the parent running 100...I don't think he wants 1000 processes.

    Update: my mistake... this is not the post you are looking for... move along

      You are mostly right. In the end he does want to run 1000 jobs, but only 100 at a time at most, launched in blocks of 10.

      When constructing a P::FM object you declare how many forks you want running at once. In the main part of the program:

      my $max_children = 100;
      my $job_size     = 10;
      my $main_pfm = P::FM->new( $max_children / $job_size );

      So that is 100/10, which is 10 job blocks at the same time. Since each block contains 10 jobs, that is 100 jobs running at the same time.

      my $total_children = 1000;
      my $job_size       = 10;
      foreach ( 1 .. $total_children / $job_size ) {

      So then we loop through our data (P::FM blocks if there are no available slots): 1000/10 is 100, so the main loop iterates over 100 job blocks while only running 10 blocks at a time.

        I have read of a limitation of P::FM: you cannot create a new P::FM instance until all child processes from the previous one have finished. This limitation is mentioned in the CPAN documentation for P::FM. Can anyone give another suggestion?
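        One possible workaround (my own sketch, not from the thread, and untested) is to skip nested P::FM instances entirely and manage the pool with plain fork/wait: keep up to 100 children running, and whenever the pool is full, reap a batch of ten finished children (whichever exit first) before launching more. The variable names below are just placeholders for the OP's numbers.

        use strict;
        use warnings;

        my $total_children = 1000;   # total jobs to run
        my $max_children   = 100;    # maximum running at once
        my $batch          = 10;     # how many must finish before launching more

        my %running;
        for my $job ( 1 .. $total_children ) {
            if ( keys %running >= $max_children ) {
                # Pool is full: block until a whole batch of children has exited.
                for ( 1 .. $batch ) {
                    my $pid = wait();          # blocks until some child exits
                    delete $running{$pid};
                }
            }
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ( $pid == 0 ) {
                # child: the real job would go here
                exit 0;
            }
            $running{$pid} = 1;
        }
        1 while wait() != -1;    # reap whatever is still running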
Re^2: Parallel::ForkManager How to wait when 1st 10 Child processe are over out of 100
by zwon (Abbot) on Dec 03, 2011 at 03:00 UTC

    If you replace

    # job
    with
    sleep $_;
    you will see that this approach doesn't do exactly what the OP wants. It will start the next 10 jobs for the first time only after 99 jobs have finished.
      I'm not sure about that, and I'm not on unix at the moment, but a simple sleep won't tell me anything. So for a simple test, I might try this:
      my $total_children = 1000;
      my @job_list       = 1 .. $total_children;
      my $max_children   = 100;
      my $job_size       = 10;
      my $main_pfm = P::FM->new( int($max_children / $job_size) );
      my $sleep_time;

      while ( my @batch = splice(@job_list, 0, $job_size) ) {
          $sleep_time++;
          $main_pfm->start and next;
          run_job( @batch );
          $main_pfm->finish;
      }
      $main_pfm->wait_all_children;

      sub run_job {
          my @jobs    = @_;
          my $job_pfm = P::FM->new( scalar(@jobs) );
          foreach ( @jobs ) {
              $job_pfm->start and next;
              print "Starting job $_\n";
              sleep $sleep_time;
              $job_pfm->finish;
          }
          $job_pfm->wait_all_children;    # wait for this batch's workers before returning
      }
      Not tested. Not even a little bit.

        I was talking about this:

        my $total_children = 1000;
        my $max_children   = 100;
        my $job_size       = 10;
        my $main_pfm = P::FM->new( $max_children / $job_size );

        foreach ( 1 .. $total_children / $job_size ) {
            $main_pfm->start and next;
            run_job( $job_size );
            $main_pfm->finish;
        }
        $main_pfm->wait_all_children;

        sub run_job {
            my $job_size = shift;
            my $job_pfm  = P::FM->new( $job_size );
            foreach ( 1 .. $job_size ) {
                $job_pfm->start and next;
                sleep $_;
                $job_pfm->finish;
            }
            $job_pfm->wait_all_children;    # wait for this block's workers
        }

        One second after you start this, the first ten worker processes will finish, but all of the run_job processes will still be active. The program will not start any new processes until one of the run_job processes exits, and that will only happen after 10 seconds.

        I see it. He is looking at the case where each "job" is identical overall, but the processes within a job differ from one another.

        Each job has 10 processes, and each process runs for its starting order within the job (procX) times 1 second. So the first process in a job takes 1 second and the last takes 10 seconds. When you run 10 of these jobs at the same time, process procX of every job finishes at more or less the same moment. Since new jobs are not launched until a previous job has completely drained, the count of concurrent processes drops very low before any job finishes.

        I think the requirements the OP gave could use some improvement, but they do say that the "first 10 processes" need to finish before 10 new ones are launched.

        PS. If this didn't make it clear, just draw out a timeline on paper and track when each process starts and ends while keeping count of the concurrent processes.
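        If drawing it by hand gets tedious, here is a rough sketch (mine, not from the thread) of how one might log the timeline instead, using P::FM's run_on_start and run_on_finish callbacks to timestamp each worker; the 1..10 sleep pattern matches the jobs discussed above.

        use strict;
        use warnings;
        use Parallel::ForkManager;

        my $pfm = Parallel::ForkManager->new(10);

        $pfm->run_on_start( sub {
            my ($pid, $ident) = @_;
            printf "%3ds  start  job %s (pid %d)\n", time - $^T, $ident, $pid;
        } );
        $pfm->run_on_finish( sub {
            my ($pid, $exit_code, $ident) = @_;
            # fires when the parent reaps the child, which is close enough to its exit time
            printf "%3ds  finish job %s (pid %d)\n", time - $^T, $ident, $pid;
        } );

        for my $n ( 1 .. 10 ) {
            $pfm->start($n) and next;    # pass $n as the identifier
            sleep $n;                    # job $n runs for $n seconds
            $pfm->finish;
        }
        $pfm->wait_all_children;

        Counting the "start" lines that have no matching "finish" line yet at any given second gives the number of concurrent processes over time.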