Spesh00 has asked for the wisdom of the Perl Monks concerning the following question:

I have a large list of URLs that need to be tested for response time every x seconds. I already have a script that iterates through the list; however, I would like to fork it out to speed things up.

I came across some code that appears to do what I need, and I've trimmed it down for example's sake. However, when I run it with just a for loop to test the "iteration" part, I find that each child runs through the entire list, when I really just want each child to handle one of the rows in the array -> do its thing -> and then die.

I've already forkbombed my server once - any help with quickly working through an array with fork would be greatly appreciated. The example is below:

#!/usr/bin/perl
#...
$maxfork = 20;        # maximum number of procs
$stime   = time();    # get current time in secs since 1970

# --- check if older proc is already running ------
stat $lock;
if (-e _) {
    $curtime = scalar localtime($stime);
    print STDERR "$curtime: LOCKED, skipping.\n";
    exit 1;
}

# nope, so lock it
open LOCK, ">$lock";
print LOCK "$$";
close LOCK;

$loop   = 0;
$forked = 0;
for ($count = 1; $count < 50; $count++) {
    if ($forked > $maxfork) {
        wait;
        $forked--;
    }
    if ($pid = fork) {
        $forked++;    # parent
    }
    elsif (defined $pid) {
        # child
        # The place where I would like to do the work against one of the array cells?
        print "Fork number is $forked - Counter is $count\n";
    }
    else {
        die "error forking!";
    }
    $loop++;
}

# wait for child procs
while (wait != -1) { ; }
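The reason each child "runs through the entire list" with a loop like the one above is that the child falls out of the if/elsif branch and keeps executing the for loop, forking children of its own. A minimal sketch of the fix (the @urls list here is a placeholder, not from the original post): the child does its one unit of work and then exits before the loop can continue.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @urls    = ('a' .. 'e');   # stand-in for the real URL list
my $maxfork = 3;
my $forked  = 0;

foreach my $url (@urls) {
    if ($forked >= $maxfork) {   # throttle: reap one child first
        wait;
        $forked--;
    }
    my $pid = fork;
    die "error forking: $!" unless defined $pid;
    if ($pid) {
        $forked++;               # parent: keep looping over the list
    }
    else {
        print "child $$ handling $url\n";   # work on ONE element...
        exit 0;   # ...then exit, so the child never re-enters the loop
    }
}
1 until wait == -1;              # reap any remaining children
print "all children done\n";
```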

Replies are listed 'Best First'.
Re: Iteration through large array using a N number of forks.
by dragonchild (Archbishop) on Feb 22, 2005 at 17:22 UTC
    Parallel::ForkManager is the standard solution to this.
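    A minimal sketch of what the loop could look like with Parallel::ForkManager (the @urls list and the child body are placeholder assumptions, not from the original post):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my @urls = qw(www.example.com www.example.org);   # placeholder list
my $pm   = Parallel::ForkManager->new(20);        # at most 20 children alive

foreach my $url (@urls) {
    $pm->start and next;    # parent gets a true value and moves to the next URL
    # child: test one URL's response time here, then...
    print "child $$ checked $url\n";
    $pm->finish;            # ...exit; the module handles the reaping
}
$pm->wait_all_children;     # block until every child has finished
```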

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

      Unfortunately, the environment I am doing development in is extremely restrictive, so adding Perl modules is kind of a last-ditch option. I was hoping something in native code might be able to achieve this result without having to use them - or, if possible, something that ships with 5.8. Thanks for the insight, however, and I will investigate it in case I have no other choice.
        Cut'n'paste the code from Parallel::ForkManager into a module that you "wrote". It's PurePerl. Heck, if you need it, I cut'n'pasted the code into this node (within the readmore). If you feel bad, add an attribution.

        Note: a restrictive development environment is one that encourages the development of costly and buggy software. You have a perfectly good solution that was developed under the same development model that Perl itself was developed under. Is that seriously a big issue?


        Even in a restrictive environment you can create your own module directory and install the module there. With simple pure-Perl modules (no compiler needed) you can simply fetch the module file(s) and copy them to that directory. Parallel::ForkManager is pure Perl, so no problem. You only have to tell perl where to look:
        use lib qw(/some/lib/path);
        And even in the most restrictive environment you can simply put the module's code in your script file.

        Modules with C code are more complicated: you're lost if you don't have access to a compiler, unless you can find a prebuilt version.

        If you have no shell access, these nodes might be helpful: Install Perl Modules Using FTP Without Having Shell Access?, Installing modules without root and shell


        holli, /regexed monk/
Re: Iteration through large array using a N number of forks.
by zentara (Cardinal) on Feb 22, 2005 at 19:51 UTC
    Here is an example which may work for you, based on some snippets Abigail posted a while ago. My example:
    #!/usr/bin/perl
    use warnings;
    if ($#ARGV < 0) { @ARGV = qw(a b c d) }
    &afork(\@ARGV, 4, \&mysub);
    print "Main says: All done now\n";

    sub mysub {
        my $x = $_[0];
        system "mkdir dir$x";
        chdir "dir$x" or die $!;
        for ($i = 1; $i < 10; $i++) {
            system "touch $i-$x";
            # open3(OUTPUT, INPUT, ERRORS, cd dir$x; make clean; make all);
            #<code to process the output of the make commands and store into logfiles>
        }
    }

    ##################################################
    sub afork (\@$&) {
        my ($data, $max, $code) = @_;
        my $c = 0;
        foreach my $data (@$data) {
            wait unless ++$c <= $max;
            die "Fork failed: $!\n" unless defined (my $pid = fork);
            exit $code->($data) unless $pid;
        }
        1 until -1 == wait;
    }
    #####################################################
    The original post from Abigail:
    #!/usr/bin/perl
    # by Abigail of perlmonks.org
    #
    # Sometimes you have a need to fork off several children, but you want to
    # limit the maximum number of children that are alive at one time. Here
    # are two little subroutines that might help you, mfork and afork. They
    # are very similar. They take three arguments, and differ in the first
    # argument. For mfork, the first argument is a number, indicating how
    # many children should be forked. For afork, the first argument is an
    # array - a child will be forked for each array element. The second
    # argument indicates the maximum number of children that may be alive at
    # one time. The third argument is a code reference; this is the code that
    # will be executed by the child. One argument will be given to this code
    # fragment; for mfork it will be an increasing number, starting at one.
    # Each next child gets the next number. For afork, the array element is
    # passed. Note that this code will assume no other children will be
    # spawned, and that $SIG{CHLD} hasn't been set to IGNORE.

    mfork(10, 10, \&hello);

    sub hello { print "hello world\n"; }
    print "all done now\n";

    ###################################################
    sub mfork ($$&) {
        my ($count, $max, $code) = @_;
        foreach my $c (1 .. $count) {
            wait unless $c <= $max;
            die "Fork failed: $!\n" unless defined (my $pid = fork);
            exit $code->($c) unless $pid;
        }
        1 until -1 == wait;
    }
    ##################################################
    sub afork (\@$&) {
        my ($data, $max, $code) = @_;
        my $c = 0;
        foreach my $data (@$data) {
            wait unless ++$c <= $max;
            die "Fork failed: $!\n" unless defined (my $pid = fork);
            exit $code->($data) unless $pid;
        }
        1 until -1 == wait;
    }
    #####################################################
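    Applied to the OP's actual task, a hedged sketch using only core modules (Time::HiRes and IO::Socket::INET; the host list, port, and timeout are placeholder assumptions, and a real check would send an HTTP request rather than just open a connection):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);    # core since perl 5.7.3

my @hosts = qw(www.example.com www.example.org);   # placeholder list

# & prefix bypasses afork's (\@$&) prototype, as in zentara's example
&afork(\@hosts, 20, \&check_one);
print "all checks done\n";

# Child body: time one connection attempt against one host.
sub check_one {
    my $host  = shift;
    my $start = time;
    require IO::Socket::INET;                 # core module
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Timeout  => 5,
    );
    my $elapsed = time - $start;
    printf "%s: %s in %.3fs\n", $host, ($sock ? "up" : "DOWN"), $elapsed;
    return 0;    # becomes the child's exit status via afork's exit
}

sub afork (\@$&) {    # unchanged from Abigail's version above
    my ($data, $max, $code) = @_;
    my $c = 0;
    foreach my $data (@$data) {
        wait unless ++$c <= $max;
        die "Fork failed: $!\n" unless defined (my $pid = fork);
        exit $code->($data) unless $pid;
    }
    1 until -1 == wait;
}
```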

    I'm not really a human, but I play one on earth. flash japh
      Wow. Excellent, thanks everyone! Extremely appreciated. As a bonus question: would it be all that difficult to also simply have those threads sleep instead of dying outright, and then fire back up to reiterate through the file? I'm just weighing the overhead of having the overall script fire up every x minutes and spawn 30 kids, versus just having it sleep with its 30 kids and then reiterate.
        Would it be all that difficult to also simply have those threads sleep instead of dying outright, and then fire back up to reiterate through the file?

        If you use threads, then that is no problem whatsoever :)
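        A hedged sketch with the core threads module (available in 5.8 when perl is built with ithread support); the URL list, interval, and round count are placeholders, and a real monitor would loop forever rather than a fixed number of rounds:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;

my @urls     = qw(a b c);   # placeholder URL list
my $interval = 1;           # seconds between rounds; minutes in real use
my $rounds   = 2;           # bounded here so the example terminates

# One long-lived worker per URL: do the check, sleep, repeat.
my @workers = map {
    my $url = $_;
    threads->create(sub {
        for (1 .. $rounds) {          # real code: while (1) { ... }
            print "checking $url\n";  # the response-time test goes here
            sleep $interval;
        }
    });
} @urls;

$_->join for @workers;   # never returns if the workers loop forever
print "main: all workers joined\n";
```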


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
Re: Iteration through large array using a N number of forks.
by perrin (Chancellor) on Feb 22, 2005 at 19:08 UTC