ramblinpeck has asked for the wisdom of the Perl Monks concerning the following question:

So I'm creating what amounts to a web spider of sorts, and I want to add some bandwidth usage monitoring/throttling so that I'm not completely saturating the connection on the system running it, and also to keep my downloading threads at a fairly consistent usage. Any modules that I should look at? Running on a Debian Linux system. Thanks,
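For the "consistent usage" part, one module worth a look is LWP::RobotUA (part of libwww-perl), which enforces a delay between requests to the same host. For actual byte-rate limiting there is no module named in this thread, but a hand-rolled token-bucket throttle is small; the sketch below is untested and the `Throttle` package name and interface are invented for illustration, not a CPAN module:

```perl
use strict;
use warnings;
use Time::HiRes ();

# A minimal token-bucket throttle (illustrative only -- the Throttle
# name and its interface are made up for this sketch).
package Throttle {
    sub new {
        my ($class, %args) = @_;
        my $rate = $args{rate} || 65536;    # byte budget per second
        return bless {
            rate       => $rate,
            allowance  => $rate,
            last_check => Time::HiRes::time(),
        }, $class;
    }

    # Call after downloading each chunk; sleeps just long enough to
    # keep the average throughput at or below the configured rate.
    sub consume {
        my ($self, $bytes) = @_;
        my $now = Time::HiRes::time();
        $self->{allowance} += ($now - $self->{last_check}) * $self->{rate};
        $self->{last_check} = $now;
        $self->{allowance} = $self->{rate}
            if $self->{allowance} > $self->{rate};
        $self->{allowance} -= $bytes;
        if ($self->{allowance} < 0) {
            # We overspent the budget; pause until it is paid back.
            Time::HiRes::sleep(-$self->{allowance} / $self->{rate});
            $self->{allowance} = 0;
        }
    }
}
```

Each downloading thread would hold its own throttle (or share one behind a lock) and call `consume` with the size of every chunk it reads.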

Replies are listed 'Best First'.
Re: Thread::Queue Internals/Duplicates
by martin (Friar) on Feb 03, 2006 at 20:55 UTC
    You might want to use a shared hash to do the winnowing, like so (untested):
    use strict;
    use warnings;
    use threads;
    use threads::shared;
    use Thread::Queue;

    my $queue = Thread::Queue->new;
    my %urls = ();
    share(%urls);

    my $url = 'http://www.perlmonks.org/';
    my $visited = do { lock(%urls); $urls{$url}++ };
    if (!$visited) {
        $queue->enqueue($url);
    }

    Update: added locking
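The same winnowing idea can be extended with a consuming worker thread. The following is an untested sketch; the `maybe_enqueue` helper is invented for illustration, and the `undef` item used as a shutdown signal is just a common convention, not anything Thread::Queue requires:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $queue = Thread::Queue->new;
my %urls :shared;
my $fetched :shared = 0;

# Enqueue a URL only if no thread has seen it before.
sub maybe_enqueue {
    my ($url) = @_;
    my $visited = do { lock(%urls); $urls{$url}++ };
    $queue->enqueue($url) unless $visited;
}

# Downloader thread: blocks on dequeue until work (or the undef
# shutdown sentinel) arrives.
my $worker = threads->create(sub {
    while (defined(my $url = $queue->dequeue)) {
        # real fetching would happen here
        { lock($fetched); $fetched++; }
    }
});

maybe_enqueue('http://www.perlmonks.org/');
maybe_enqueue('http://www.perlmonks.org/');   # duplicate, dropped
$queue->enqueue(undef);                       # tell the worker to quit
$worker->join;
```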

      Doh, you must have caught me updating :). I figured out the Thread::Queue thing, with pretty much the suggestion you gave. Thanks,