I've tried setting it up several different ways, but I'm just not getting anything that works yet.
I have been able to write a script that starts one thread per URL and fetches all the sites' content in parallel. However, I want to limit the number of threads running at any given time, to keep the script from sucking up all the bandwidth and CPU time as the list of URLs grows.
Here is my non-working example:
use Thread;
use Thread::Queue;
use LWP::Simple;
use strict;

my $q = new Thread::Queue;
$q->enqueue(qw(
    http://www.slashdot.org/
    http://www.freshmeat.net/
    http://www.perlmonks.org/
    http://www.mozilla.org/
    http://dada.perl.it/
    http://www.google.com/
    http://www.linux.com/
    http://www.beachside.net/
    http://www.perl.com/
    http://www.httptech.com/
));

while ($q->pending) {
    my $kid;
    my @threads = Thread->list;
    my $current = scalar(@threads);
    if ($current < 6) {
        my $url = $q->dequeue;
        print "Retrieving $url\n";
        $kid = new Thread(\&get_url, $url);
    }
}

$q->enqueue(undef);

# Don't know if I even need this part; it has no effect at this point
for (Thread->list) {
    if ($_->tid && !Thread::equal($_, Thread->self)) {
        $_->join;
    }
}

sub get_url {
    my $url = shift;
    get($url);
    print "Retrieved $url\n";
}
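For comparison, here is the kind of fixed-size worker pool I think I'm after, sketched with the newer threads module rather than the old Thread one. This is untested; the pool size of 5 and the worker() name are just placeholders I made up:

use strict;
use warnings;
use threads;
use Thread::Queue;
use LWP::Simple;

my $MAX_WORKERS = 5;    # arbitrary limit; raise or lower to taste

my $q = Thread::Queue->new;
$q->enqueue($_) for qw(
    http://www.slashdot.org/
    http://www.freshmeat.net/
    http://www.perlmonks.org/
    http://www.mozilla.org/
    http://dada.perl.it/
    http://www.google.com/
    http://www.linux.com/
    http://www.beachside.net/
    http://www.perl.com/
    http://www.httptech.com/
);

# One undef per worker tells each thread there is no more work.
$q->enqueue(undef) for 1 .. $MAX_WORKERS;

# Start a fixed number of workers; the queue itself limits concurrency.
my @workers = map { threads->create(\&worker) } 1 .. $MAX_WORKERS;
$_->join for @workers;

sub worker {
    # dequeue blocks until an item is available; undef means quit
    while (defined(my $url = $q->dequeue)) {
        print "Retrieving $url\n";
        get($url);
        print "Retrieved $url\n";
    }
}

The idea is that nothing ever spawns a thread per URL: the five workers are created once and keep pulling from the shared queue until they each hit an undef, so the thread count can never exceed $MAX_WORKERS.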