anshumangoyal has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a load script in which the system has to fork 100 processes: it should wait until the first 10 processes are complete/dropped/terminated, then start another 10 child processes, and so on until the total number of children spawned reaches 1000. I am using Parallel::ForkManager, and there is some functionality I don't know how to use, namely wait_all_children and wait_one_child. This is a snippet, not the complete code. The last if has to wait until 10 child processes are finished and then start another 10 child processes. I don't know how to wait for 10 child processes to finish. Here is the code I am writing:
#Execute 10 number of calls
foreach (1..$cps) {
    #Child Process Starts Here
    my $pid = $pm->start and next;
    if (defined $proxy) {
        $ENV{http_proxy} = "$proxy";
    }
    `curl -o /dev/null -m 222 \"$Link\" 2>&1`;
    #child process ends here
    $pm->finish;
}
#wait for 10 calls to terminate and then pump another 10 calls.
if ($callsRunning == $calls_at_a_time) {
    $pm->wait_children($cps);
    $callsRunning -= $cps;
}
}
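For reference, one straightforward shape for what the OP describes is a sketch like the following: fork each batch of 10 and call wait_all_children before starting the next batch. $Link and the curl arguments are taken from the snippet above; the URL here is a placeholder and this has not been run against a live server.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $total      = 1000;                  # total children to spawn
my $batch_size = 10;                    # children per batch
my $Link       = 'http://example.com';  # placeholder URL
my $proxy;                              # optionally set a proxy here

my $pm = Parallel::ForkManager->new($batch_size);

for my $batch ( 1 .. $total / $batch_size ) {
    for ( 1 .. $batch_size ) {
        $pm->start and next;                       # parent forks and continues
        $ENV{http_proxy} = $proxy if defined $proxy;
        `curl -o /dev/null -m 222 "$Link" 2>&1`;   # child does the work
        $pm->finish;                               # child exits
    }
    $pm->wait_all_children;   # block until all 10 children of this batch are done
}
```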

Replies are listed 'Best First'.
Re: Parallel::ForkManager How to wait when 1st 10 Child processes are over out of 100
by runrig (Abbot) on Dec 01, 2011 at 16:04 UTC
    Do you really need to wait until a batch of 10 processes has finished before you start 10 more, or can you just keep 10 (or 100) running simultaneously and start another process whenever one finishes? The latter is simple with Parallel::ForkManager. The former might be simpler without Parallel::ForkManager, but it could probably be done by creating a P::FM object that runs 10 processes at a time and passing 10 jobs to each process, then using a new Parallel::ForkManager object in each child process to run its 10 jobs simultaneously, thereby having 100 processes running at any one time.
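The "latter" pattern runrig mentions (keep a fixed pool topped up, starting a new child as soon as any one exits) is Parallel::ForkManager's default behaviour when start() is called more times than the pool size. A minimal sketch, with do_one_job() standing in for the real work:

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(100);   # at most 100 live children

for my $job ( 1 .. 1000 ) {
    # start() blocks in the parent whenever 100 children are already
    # running, and returns as soon as one of them exits
    $pm->start and next;
    do_one_job($job);     # child: placeholder for the actual work
    $pm->finish;
}
$pm->wait_all_children;

sub do_one_job { my ($n) = @_; sleep 1 }    # dummy job
```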
Re: Parallel::ForkManager How to wait when 1st 10 Child processes are over out of 100
by admiral_grinder (Pilgrim) on Dec 01, 2011 at 16:28 UTC

    This might work; I didn't run it myself:

    my $total_children = 1000;
    my $max_children   = 100;
    my $job_size       = 10;

    my $main_pfm = P::FM->new( $max_children / $job_size );

    foreach ( 1 .. $total_children / $job_size ) {
        $main_pfm->start and next;
        run_job( $job_size );
        $main_pfm->finish;
    }

    sub run_job {
        my $job_size = shift;
        my $job_pfm = P::FM->new( $job_size );
        foreach ( 1 .. $job_size ) {
            $job_pfm->start and next;
            # job
            $job_pfm->finish;
        }
    }
      I believe he wants 100 jobs total to run simultaneously, so if the parent runs 10 processes and each child runs 10, then there will be 100 total. You currently have the parent running 100...I don't think he wants 1000 processes.

      Update: my mistake... this is not the post you are looking for... move along

        You are mostly right, in the end he does want to run 1000 jobs, but only have 100 at a time max and in blocks of 10.

        When calling P::FM you declare how many forks you want to run. In the main method:

        my $max_children = 100;
        my $job_size     = 10;
        my $main_pfm = P::FM->new( $max_children / $job_size );

        So that is 100/10 which is 10 job blocks at the same time. Since the job size is 10, then that is 100 jobs at the same time.

        my $total_children = 1000;
        my $job_size       = 10;
        foreach ( 1 .. $total_children / $job_size ) {

        So then we loop through our data (P::FM blocks if there are no available slots): 1000/10 is 100, so the main method loops through 100 job blocks while only running 10 blocks at a time.

      If you replace

      # job
      with
      sleep $_;
      you will see that this approach doesn't do exactly what the OP wants. It will start the next 10 jobs for the first time only after 99 jobs have finished.
        I'm not sure about that, and I'm not on unix at the moment, but a simple sleep won't tell me anything. So for a simple test, I might try this:
        my $total_children = 1000;
        my @job_list = 1 .. $total_children;
        my $max_children = 100;
        my $job_size     = 10;
        my $main_pfm = P::FM->new( int($max_children / $job_size) );
        my $sleep_time;

        while ( my @batch = splice(@job_list, 0, $job_size) ) {
            $sleep_time++;
            $main_pfm->start and next;
            run_job( @batch );
            $main_pfm->finish;
        }

        sub run_job {
            my @jobs = @_;
            my $job_pfm = P::FM->new( scalar(@jobs) );
            foreach ( @jobs ) {
                $job_pfm->start and next;
                print "Starting job $_\n";
                sleep $sleep_time;
                $job_pfm->finish;
            }
        }
        Not tested. Not even a little bit.
Re: Parallel::ForkManager How to wait when 1st 10 Child processes are over out of 100
by zwon (Abbot) on Dec 01, 2011 at 15:45 UTC

    Start 10 processes, then

    $pm->wait_all_children;
      No, I think what he is trying to do is have 100 children running at the same time max. But when 10 children drop out, start up another 10 until he has gone through 1000 children.

        Ah, I see. Then something like this may help (not tested):

        $pm->run_on_finish(
            sub {
                state $counter;
                start_next_10_processes() unless ++$counter % 10;
            }
        );

        I hope not. That simply delays the start of 900 children for no gain. Normally, P::FM works as follows:

        Have 100 children running at the same time max. But when 1 child drops out, start up another until it has gone through 1000 children.

Re: Parallel::ForkManager How to wait when 1st 10 Child processes are over out of 100
by Tanktalus (Canon) on Dec 09, 2011 at 00:27 UTC

    I know you're using P::FM. However, I have to wonder if this would be easier to do with AnyEvent. I'm not entirely sure, as after about 10 minutes, I still don't have example code quite working :-) but maybe it's something to start with.

    use 5.14.0;
    use EV;
    use AnyEvent;
    use AnyEvent::Util qw(run_cmd);
    use File::Spec;

    # set up my condvars
    my $all_done = AE::cv;
    $all_done->begin for 1..50;
    my $ten = AE::cv;

    # start the first 10...
    start_ten($ten, $all_done, $_) for 1..2;

    # wait for all of them to finish.
    $all_done->recv();

    sub start_ten {
        my $t   = shift;
        my $all = shift;
        my $one = shift;
        state $started = 0;

        # 10/100 too much for my machine...
        for (1..5) {
            $started++;
            my $proc = $started;
            $t->begin if $one == 1;
            $all->end;
            print "Starting...$proc\n";
            my $cv = run_cmd(
                ['sleep', rand(3)],
                '>'  => File::Spec->devnull(),
                '2>' => File::Spec->devnull(),
                '<'  => File::Spec->devnull(),
            );
            $cv->cb( sub {
                my $c  = shift;
                my $rc = $c->recv();
                print "Finished $proc: $rc\n";
                $t->end();
            });
        }

        # when we're done, start the next ten.
        $t->cb( sub {
            my $c = shift;
            print "Checking next group...($started)\n";
            $c->recv;
            $all->ready() or start_ten($t, $all, 1);
        });
    }
    There is some tweaking required: this will start the first two groups of five and wait; once the first five come in, it goes and starts the rest, so it's not waiting properly yet. There is also AnyEvent::Util::fork_call, which is quite similar to what you want to do, and you may be able to adapt its approach while still leaving much of the heavy lifting to AnyEvent.

    Note that 100 processes being kicked off can do serious damage to your system's responsiveness. Hopefully you have the requisite CPU/RAM :-)

Re: Parallel::ForkManager How to wait when 1st 10 Child processes are over out of 100
by sundialsvc4 (Abbot) on Dec 05, 2011 at 13:23 UTC

    I have consistently found it most advantageous to start a fixed number of processes, then to have each of them consume “work to do” until there is no more work to be done. (Then, they die off.)

    For example, “launch 10 processes which, using a simple loop, each do (or execute...) a particular unit of work 100 times in a row.” Or, as the case may be, they do the unit of work while() the value of some atomic global counter is greater than zero, or until some file or queue is empty.

    A mechanism devised in this way is simple to construct, has an obvious and convenient “throttle,” and is entirely self-regulating. Many human workflows are constructed in just this way.
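    As a sketch of that worker-pool shape in plain Perl (assumptions: the 1000 units split evenly, 100 per worker, and do_unit() is a hypothetical placeholder for the real work):

    ```perl
    use strict;
    use warnings;

    my $workers        = 10;
    my $units_per_proc = 100;   # 10 workers x 100 units = 1000 units total

    for my $w ( 1 .. $workers ) {
        defined( my $pid = fork() ) or die "fork failed: $!";
        next if $pid;                               # parent: launch next worker
        do_unit($w, $_) for 1 .. $units_per_proc;   # child: consume its share
        exit 0;                                     # then die off
    }
    1 while wait() != -1;                           # parent reaps all workers

    sub do_unit { my ($worker, $unit) = @_; }       # placeholder for real work
    ```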