Kelicula has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Perl Monks!

I am NOT new to Perl but I am VERY new to "forking" or "threading". What I am trying to do is have a certain number of processes running continuously, and when one finishes another one takes its place... I almost had it this morning, then after trying SO MANY different things I can't even find the original code!! ARG.... But it's driving me crazy.. I know I am so close...

So let's say I have 33 clients and I want to process 10 of them at a time: I open one, then the next, etc., until I reach 10; then I want to wait until "any" one of those ten is finished, then start client 11, and so on and so forth...

Here's what I have so far...
#!/usr/bin/perl
use feature qw/say/;

my $count = 1;

CLIENT: for my $i (1..33){
    if ( $count > 10 ) {
        say "waiting for open process";
        while (1) {
            if ( wait() ){
                $count = 1;
                redo CLIENT;
            }
        }
    }
    else {
        # Stagger the initiating to help CPU
        sleep(5);
    }
    $count++ and next if( my $pid = fork() );
    unless( $pid ){
        say "Processing client $i process count $count";
        # Emulate the time it would take to process..
        sleep(60);
        exit;
    }
}
The output I'm getting is:
Processing client 1 process count 1
Processing client 2 process count 2
waiting for open process
Processing client 3 process count 3
Processing client 4 process count 4
Processing client 5 process count 5
Processing client 6 process count 6
Processing client 7 process count 7
Processing client 8 process count 8
Processing client 9 process count 9
Processing client 10 process count 10
waiting for open process
Processing client 11 process count 1
Processing client 12 process count 2
Processing client 13 process count 3
Processing client 14 process count 4
Processing client 15 process count 5
Processing client 16 process count 6
Processing client 17 process count 7
Processing client 18 process count 8
Processing client 19 process count 9
waiting for open process
Processing client 20 process count 10
Processing client 21 process count 1
Processing client 22 process count 2
Processing client 23 process count 3
Processing client 24 process count 4
Processing client 25 process count 5
Processing client 26 process count 6
Processing client 27 process count 7
Processing client 28 process count 8
Processing client 29 process count 9
Processing client 30 process count 10
Processing client 31 process count 1
Processing client 32 process count 2
Processing client 33 process count 3

What's going on here?

What am I missing?

I've also tried implementing it with Parallel::ForkManager and Thread::Queue... It just seems that I'm missing something very simple yet elementary.

Any help would be greatly appreciated!! All I'm doing now is opening 10 clients and waiting an hour to open the next set using:

 system( "pathtoperlfile", "args");

in a loop, but it's wasting SO MUCH time. I think I could increase my productivity 2-3 fold if I could maintain a constant number of processes...
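In pseudo-terms, what I want is: fork until 10 are running, wait() for any ONE to exit, open up that slot, fork the next client. Something like this sketch, which I haven't battle-tested (do_client() is just a stand-in for the real work):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $running = 0;
    for my $i (1..33) {
        if ($running >= 10) {
            wait();     # block until any ONE child exits
            $running--; # exactly one slot opened up
        }
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid) { $running++; next; } # parent bookkeeping
        do_client($i);                  # child does the work...
        exit;                           # ...and must exit, never loop
    }
    1 while wait() > 0; # reap the last 10 stragglers

    sub do_client { my ($i) = @_; print "client $i\n"; sleep 60; }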

I humbly pray for the wisdom of the Monks...

UPDATE: I'm running this on Windows Server 2012 with Dwimperl. I've been informed that fork is "emulated" on Windows and that the server may kill long-running processes that are just "waiting". Any idea how to handle this?

Kelicula
~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~

Re: Sequential processing with fork.
by stevieb (Canon) on Aug 04, 2015 at 19:33 UTC

    Here's an example using Parallel::ForkManager that has come in very handy lately for these types of questions. It'll process $max_forks (10 here) at a time, calling do_something() for each client until all 33 in the for loop are exhausted.

    #!/usr/bin/perl
    use warnings;
    use strict;

    use Parallel::ForkManager;

    my $max_forks = 10;
    my $fork = new Parallel::ForkManager($max_forks);

    # on start callback
    $fork->run_on_start(
        sub {
            my $pid = shift;
        }
    );

    # on finish callback
    $fork->run_on_finish(
        sub {
            my ($pid, $exit, $ident, $signal, $core) = @_;
            if ($core){
                print "PID $pid core dumped.\n";
            }
        }
    );

    # forking code
    for my $client (1..33){
        $fork->start and next;
        do_something($client);
        sleep(2);
        $fork->finish;
    }

    sub do_something {
        my $client = shift;
        print "$client\n";
    }

    $fork->wait_all_children;

    -stevieb

    UPDATE: I can't say for sure, but after removing the sleep statement, the output hints that it'll add more into the queue before previous ones are finished, as long as the count doesn't go over the max of 10. I'm not 100% sure of this though.

    UPDATE 2: According to the Parallel::ForkManager docs, it does indeed start another proc as soon as one finishes. The number of free slots to wait for is configurable:

    wait_for_available_procs( $n )
        Wait until $n available process slots are available. If $n is not given, defaults to 1.
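    A rough sketch of how that could be used: plain start() already blocks until a single slot frees up, so wait_for_available_procs() mainly matters when you want several slots free before dispatching (the 3 below is an arbitrary example, not from the docs):

        use Parallel::ForkManager;

        my $pm = Parallel::ForkManager->new(10);
        for my $client (1..33) {
            # block until 3 slots are free; start() alone waits for just 1
            $pm->wait_for_available_procs(3);
            $pm->start($client) and next;
            # ... child work here ...
            $pm->finish;
        }
        $pm->wait_all_children;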
      That's the most useless run_on_finish possible. Worse than none at all.
      #!/usr/bin/perl
      use warnings;
      use strict;

      use Parallel::ForkManager qw( );

      use constant MAX_WORKERS => 10;

      sub work {
          my ($client) = @_;
          print("$client start...\n");
          sleep(3 + int(rand(2)));
          print("$client done.\n");
      }

      {
          my $pm = new Parallel::ForkManager(MAX_WORKERS);

          # Optional.
          $pm->run_on_finish(sub {
              my ($pid, $exit, $ident, $signal, $core) = @_;
              if ($signal) {
                  print("Client $ident killed by signal $signal.\n");
              } elsif ($exit) {
                  print("Client $ident exited with error $exit.\n");
              } else {
                  print("Client $ident completed successfully.\n");
              }
          });

          for my $client (1..33){
              $pm->start($client) and next;
              work($client);
              $pm->finish();
          }

          $pm->wait_all_children();
      }

        Yes, I realized after the fact that I had emptied it out. I'll leave it as-is so this post retains context. Thanks for pointing it out.

      Yes, that's exactly what I want...as soon as ONE process ends, start another one!!

      I just can't ever have more than ten running simultaneously...

      ~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~

      Stevieb,

      Thank you! Your code seemed to point me in the right direction and do just what I needed (after trying several other examples); the only problem is that, for whatever reason, it doesn't progress to the next process after the initial group is started.

      As I mentioned earlier, I'm actually calling another file and passing it the client ID via @ARGV, so it's not like it's simply a lexical scope that's being forked.

      I've tried using exec and system, to no avail. After one of the original group of processes ends, it doesn't start another one to replace it or create a new one to run the next client.

      After all the initial processes finish, the main program just ends. :-( Any idea what could be going on?

      Here's what I'm working with..

      #!/usr/bin/perl
      use warnings;
      use strict;

      use Uber qw/ client_list /;
      use Parallel::ForkManager;

      my $max_forks = 2; # Changed to 2 just to see if it would go to the next one...

      my $clients = client_list();
      my $fork = new Parallel::ForkManager($max_forks);

      for my $client ( @$clients ){
          $fork->start( $client->{id} ) and next;
          do_something( $client->{id} );
          $fork->finish;
      }

      sub do_something {
          my $client = shift;
          system( "perl", "C:/Path/To/Folder/process.pl", "$client" );
      }

      $fork->wait_all_children;
      Thank you for the example, but yes, my system can only handle around ten running at a time.
      ~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~

        Then my code does exactly what you need. You can modify or even remove the $fork->run_on_* subs if you don't need the PIDs or don't care if any of the procs failed.

Re: Sequential processing with fork.
by SuicideJunkie (Vicar) on Aug 04, 2015 at 19:41 UTC

    It looks like you've got clients forking more clients, and that goes out of control fast.

    ISTM that what you want is:

    1. A master routine to:
      1. Fork off ten clients
      2. Add to the queue of tasks until done
      3. Queue up ten special 'terminate' tasks
    2. A client routine (ten copies made by the master) to:
      1. Shift a task out of the queue
      2. Do the task
      3. Terminate if the task is a special terminate task
      4. Sleep if there are no tasks in the queue at the moment

    Keeping your forked code sterile will help prevent you from accidentally making a fork bomb. (A rough sketch of the outline above follows.)
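    Here's one hedged rendering of that outline, using core threads and Thread::Queue for the shared queue (plain fork()ed processes don't share memory, and threads are native on Windows, where fork is only emulated anyway; do_task() is a placeholder for the real client work):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        use constant TERMINATE => 'TERMINATE'; # the special 'terminate' task

        my $q = Thread::Queue->new;

        # Client routine: shift a task, do it, stop on the terminate marker.
        # dequeue() blocks while the queue is empty, which covers step 4.
        my @workers = map {
            threads->create(sub {
                while (1) {
                    my $task = $q->dequeue;
                    last if $task eq TERMINATE;
                    do_task($task);
                }
            });
        } 1 .. 10;

        $q->enqueue($_) for 1 .. 33;   # add to the queue of tasks
        $q->enqueue((TERMINATE) x 10); # queue up ten 'terminate' tasks
        $_->join for @workers;

        sub do_task { my ($t) = @_; print "task $t\n"; sleep 1; }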

      "It looks like you've got clients forking more clients, and that goes out of control fast."

      I laughed out loud quite loudly in my office as soon as I read this for some reason. Nice :)

        Now if I could get all my clients to fork each other, the problem would take care of itself aye? Haha
        ~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~

      Yes, I would definitely not want a fork bomb.

      The process that runs for each client is actually a 3000-line script that uses .NET to automate Internet Explorer in "private" mode (so they can all run together without cookie_jar issues), so there's no room for more processes than needed.

      As I said previously, I've just been checking the count and sleeping for roughly an hour to let all 10 finish, then starting the next group, but I often see only 1 or 2 left running, and realized I could significantly speed it up by ensuring there were ALWAYS 10 running...

      I'll incorporate all you guys' advice and post the results to show the solution.

      Thank you!!

      Oh yeah, I obviously don't want to start my process with system() anymore, because that creates another fork, right?

      Could I just use backticks to call the other script and pass the client ID as $ARGV[0]?

      Any thoughts?...
      ~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~
        "exec" it
Re: Sequential processing with fork.
by Anonymous Monk on Aug 04, 2015 at 21:22 UTC

    "What's going on here?" Too much code! It's simpler than that :)

    #!/usr/bin/perl
    # http://perlmonks.org/?node_id=1137416
    use strict;
    use warnings;

    $| = 1;

    for my $i (1..33) {
        $i > 10 and warn("waiting...\n"), wait;
        fork or warn("client $i started\n"), sleep(60), die("client $i ended\n");
        sleep 5;
    }

    1 while wait > 0; # reap the rest
    warn "all clients finished\n";

    replace the "fork or warn..." with: fork or exec("yourprocess"), die "exec failed $!";

      Dood! That is exactly what I need, and thanks for the tip about exec. I'm gonna make a working model and post results...

      ~~~~~~~~~ I'm unique, just like everybody else! ~~~~~~~~~
Re: Sequential processing with fork.
by Anonymous Monk on Aug 04, 2015 at 21:43 UTC
    By far the easiest thing to do is to put your 33 requests into a shared queue, then spawn however many workers (10) you need to have. Each worker pops a request off the queue until there are no more; then it dies.

      Words, words, words. If it was that easy you would have shown runnable code. :)

        Quite easy.

        use forks;
        use Thread::Queue qw( ); # 3.01+

        use constant NUM_WORKERS => 10;

        sub work {
            my ($client) = @_;
            print("$client start...\n");
            sleep(3 + int(rand(2)));
            print("$client done.\n");
        }

        {
            my $q = Thread::Queue->new();

            for (1..NUM_WORKERS) {
                async {
                    while (my $client = $q->dequeue()) {
                        work($client);
                    }
                };
            }

            $q->enqueue($_) for 1..33;
            $q->end();

            $_->join() for threads->list();
        }