jalebie has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am having a problem with ending child processes. I am trying to run a script on several workstations in parallel. My approach is to launch a bunch of child processes, each of which will rsh to one workstation and run the local script. With this approach I seem to be running out of open sockets, because I end up launching too many child processes at a time. So I tried using the waitpid command to wait for 20 processes to end and then launch another batch of 20 once these are done. But for some reason I am not able to control these and keep running out of open sockets. Maybe you monks can take a look at it and suggest what I am doing wrong here.

foreach $host (@hosts) {
    chomp ($host);
    my $templocal = "tmpclean.local.prl";
    my $cmd_rsh ="/bin/rshto -to 60";
    $pid = fork;
    if ($pid == 0) {
        # child
        print "$cmd_rsh $host $templocal 2 \n";
        open(PIPEHANDLE, "$cmd_rsh $host $templocal 2 |");
        my @output = <PIPEHANDLE>;
        close PIPEHANDLE;
        open(OUTFILE, ">> $output_file");
        print OUTFILE @output;
        close OUTFILE;
        exit 1;
    } else {
        push(@pids,$pid);
    }
    if ($#pids > 10) {
        print "more than 10 processes running \n";
        my $id = waitpid(-1,0);
        while (!(WIFEXITED($?)) ) {}
        $#pids = -1;
        #sleep 5;
    }
}


Am I doing the right thing here? All I want to do is launch around 20-50 processes without running out of open sockets. Please help.

Waris

Replies are listed 'Best First'.
Re: How do you wait for a process to end ?
by VSarkiss (Monsignor) on Aug 23, 2001 at 18:48 UTC

    You're only waiting for one process to end, then removing ten entries in your array of pids. The waitpid call will return status for only one deceased process, but you clear your array of ten pids in response. (BTW, it's clearer to write @pids = () instead of $#pids = -1.)
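
A minimal sketch of the point above, assuming the goal is simply to keep a fixed number of children alive: since each blocking waitpid(-1, 0) reaps exactly one child, free exactly one slot per call. The names $max_children and %running, the limit of 3, and the placeholder job list are illustrative and not from the original script; the child here just exits instead of running the rsh command.

```perl
use strict;
use warnings;

my $max_children = 3;                     # hypothetical limit
my @jobs = map { "job$_" } 1 .. 8;        # stands in for @hosts
my %running;                              # pid => job name
my ($peak, $reaped) = (0, 0);

for my $job (@jobs) {
    # At the limit: block until any ONE child exits, then free ONE slot.
    if (keys(%running) >= $max_children) {
        my $done = waitpid(-1, 0);
        if ($done > 0) {
            delete $running{$done};
            $reaped++;
        }
    }

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: the real script would run the remote command here
        exit 0;
    }

    # parent: record the new child and track the highest concurrency seen
    $running{$pid} = $job;
    my $n = keys %running;
    $peak = $n if $n > $peak;
}

# Reap whatever is still running after the last fork.
while ((my $done = waitpid(-1, 0)) > 0) {
    delete $running{$done};
    $reaped++;
}

print "reaped $reaped children, peak concurrency $peak\n";
```

Keying the bookkeeping on the pid returned by waitpid means the parent never forgets about children that are still running, which is what clearing the whole array does.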

    Another way to structure this would be to set up a signal handler for SIGCHLD and issue a wait there. Something like this (untested):

    sub reapchild {
        # already using POSIX, right??
        0 while waitpid(-1, &WNOHANG) > 0;
    }
    $SIG{CHLD} = \&reapchild;

    HTH

      Thanks man, I thought that the -1 in the waitpid call would wait for "all" child processes to end. Now it's clearer why this is happening. I understand connecting the reaper subroutine to the child signal, but how would I make it so that I know that exactly 10 processes have ended?

      p.s I have never used signal handlers before
      Waris

        No, the -1 says "any" process, not "all processes". Usually you use waitpid to wait for a specific pid (hence the name ;-). But in both cases, it will return upon detecting one terminated child process.

        To create a throttle in the signal handler, you need to make the parent stop when your list of pids reaches ten elements (like you're doing). Then you need to make the signal handler routine (reapchild in the example above) shorten the list every time waitpid returns a child's pid, i.e. a value greater than 0 (with WNOHANG it returns 0 when children exist but none has exited yet, and -1 when there are no children left). So:

        sub reapchild {
            while (waitpid(-1, &WNOHANG) > 0) {
                pop @pids;
            }
        }
        This is quick-and-dirty in that your array of pids may not correspond to what's really out there. It would be better to take the returned pid from waitpid and delete that particular element from the array. TIMTOWTDI, but here's a simple one:
        sub reapchild {
            my ($pid, $index);
            PID: while (($pid = waitpid(-1, &WNOHANG)) > 0) {
                foreach my $i (0..$#pids) {
                    if ($pids[$i] == $pid) {
                        splice @pids, $i, 1;
                        next PID;
                    }
                }
                warn "Unexpected $pid returned!\n";
            }
        }

        HTH

        You can just keep a running count of your current children - increment on each successful fork, and decrement on a successful call to \&reaper.
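
A rough sketch of the running-count idea, under a few assumptions: the counter names ($children, $total_reaped), the limit of 3, and the eight dummy jobs are made up for illustration, and the child simply exits instead of doing real work. The handler does nothing except reap and adjust the counters.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h);

my $max_children = 3;    # hypothetical limit
my $children     = 0;    # forks counted by the parent minus reaps so far
my $total_reaped = 0;

$SIG{CHLD} = sub {
    local ($!, $?);      # do not clobber the main program's status variables
    # Several children may have exited before the handler runs,
    # so keep reaping until nothing more is ready.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        $children--;
        $total_reaped++;
    }
};

for my $job (1 .. 8) {
    # Throttle: sleep returns early when a signal arrives.
    sleep 1 while $children >= $max_children;

    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: the real work would go here
        exit 0;
    }
    $children++;         # increment on each successful fork
}

# Wait for the remaining children to be reaped by the handler.
sleep 1 while $children > 0;

print "all $total_reaped children reaped\n";
```

A child can exit before the parent reaches the increment, so the counter may briefly dip below the true value; the increment and decrement still balance, and the throttle check only runs after the increment.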
Re: How do you wait for a process to end ?
by claree0 (Hermit) on Aug 23, 2001 at 18:33 UTC

    I think that you need to set up a signal handler for your children exiting -

    $SIG{CHLD} = \&some_subroutine;
    which will allow you to deal with the children as they die.
      Perl does not have reliable signal handling. If it is for production use and has to work day in, day out for an extended time, I would suggest avoiding signal handlers.
        Interesting... I have never heard that, nor had that problem. Is it that way on all *nixes, or just on OSes with weird/broken signals?
Re: How do you wait for a process to end ?
by fokat (Deacon) on Aug 24, 2001 at 03:27 UTC
    I would suggest that you check if you have enough file descriptors available for the operation.

    If memory serves me well, ulimit will help you here. Check your ulimit documentation on how to see (and change) this limit.
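
    As a hedged illustration, the limit can also be queried from inside the script with the core POSIX module. On many systems the value reported by sysconf(_SC_OPEN_MAX) matches the soft limit that ulimit -n shows, but that correspondence is an assumption worth checking on your platform; the variable name $max_fds is made up for this sketch.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_OPEN_MAX);

# Ask the system how many descriptors this process may have open at once.
my $max_fds = sysconf(_SC_OPEN_MAX);

if (defined $max_fds) {
    print "This process may hold up to $max_fds open descriptors\n";
} else {
    print "The system did not report a definite descriptor limit\n";
}
```

    Comparing that number with how many pipes and output files the parent and its children hold open at once gives a rough idea of how many children are safe to run together.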

    An easy test would be to run a batch of one or two processes to validate your code.

    You're using waitpid(), but you could also use wait(), since you want the call to block there.

    Good luck.