casimo has asked for the wisdom of the Perl Monks concerning the following question:
I am using Parallel::ForkManager to fork WWW::Mechanize processes to do some crawling. In some cases I want the forking to stop based on the content of the HTML response.
I am having trouble because the loop keeps going until all of the children have been spawned(?).
    use WWW::Mechanize;
    use Parallel::ForkManager qw( );
    use HTTP::Cookies;

    use constant MAX_CHILDREN => 3;

    {
        my $mech = WWW::Mechanize->new(timeout => 90);

        my @urllist = (
            "http://site1/",
            "http://site2/",
            "http://site3/",
            "http://site4/",
            "http://site5/"
            #etc...
        );

        my $pm = Parallel::ForkManager->new(MAX_CHILDREN);

        foreach my $url (@urllist) {
            # Forks and returns the pid for the child.
            my $pid = $pm->start() and next;

            $mech->get($url);
            $content=$mech->content;
            if ($content =~ m/string/) {
                #PROBLEM - exit the entire fork
            }

            # Exit child.
            $pm->finish();
        }
    }
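One possible direction, offered only as a sketch: a child cannot break out of the parent's loop directly, but it can report a match through its exit code, and the parent can see that code in a run_on_finish callback and stop spawning. In the code below, crawl_until_match, the $fetch code ref, and the STOP_CODE value are hypothetical names introduced for illustration; start, finish, run_on_finish, and wait_all_children are the Parallel::ForkManager calls being relied on.

```perl
use strict;
use warnings;
use Parallel::ForkManager;

use constant MAX_CHILDREN => 3;
use constant STOP_CODE    => 2;    # hypothetical exit code meaning "match found"

# Hypothetical helper, not part of any module.
# $fetch is a code ref that returns the page content for a URL; in the
# script above it would wrap $mech->get($url) and $mech->content.
sub crawl_until_match {
    my ($urls, $fetch, $pattern) = @_;

    my $pm   = Parallel::ForkManager->new(MAX_CHILDREN);
    my $stop = 0;

    # Runs in the parent each time a child is reaped.
    $pm->run_on_finish(sub {
        my ($pid, $exit_code) = @_;
        $stop = 1 if $exit_code == STOP_CODE;
    });

    for my $url (@$urls) {
        last if $stop;             # parent stops spawning new children
        $pm->start and next;       # parent continues; child falls through

        # Child process: fetch, check, and report the result via exit code.
        my $content = $fetch->($url);
        my $code = (defined $content && $content =~ $pattern) ? STOP_CODE : 0;
        $pm->finish($code);
    }

    $pm->wait_all_children;        # reap children that are still running
    return $stop;                  # true if any child reported a match
}
```

With WWW::Mechanize this might be called as crawl_until_match(\@urllist, sub { $mech->get($_[0]); $mech->content }, qr/string/). Note the limits of the sketch: children that are already running are not killed, and because the parent only learns about a match when a child is reaped, another child or two may still be started after the match occurs.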