in reply to Re: Re: Fun with threads
in thread Fun with threads

I have to admit to being very, very confused by what you're aiming to achieve here. You pass data to the threads from getData(), checking that $somedata is not null, at which point things should finish. Are these threads CPU-intensive, chewing some furry metrics from $somedata? Or do they use $somedata to build an IO pipe like a network socket? In the latter case, 1500 threads (phew) could be munching away occasionally on their sockets. 1500 threads trying to churn $somedata . . . I'd never dare. I'm pretty sure the recommendations for perl threads say that fewer == better.


Still confused; please post more code. I think this is going to bug me and my poor linux box all weekend.

Probably be useful to know what OS, CPU, and perl flavours you're stirring into this program too. :)



i can't believe its not psellchecked

Replies are listed 'Best First'.
Re: Re: ^3: Fun with threads
by znu (Acolyte) on Dec 21, 2002 at 07:20 UTC
    Hey, thanks for the concern. I looked at pg's code and I must admit I'm still confused. To me it looks like it's doing what I need it to, but it hangs after having launched 1020 or so threads. What I'm doing is downloading small chunks of data from the internet. I'm on a fairly fast link, but if I try to do the process serially it takes ages because of waiting on the HTTP request/response. So I'm launching between 50-100 separate threads to do the task; this way I find I can make better use of my bandwidth. The plan is to have X number of workers sitting there downloading these small chunks of data; when one of them finishes, a new worker is created to take its place. What I'm trying to do is probably not the best way to do it, but as I said, it's working nicely apart from hanging after having created 1020 threads. Thanks for your help!
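    A minimal sketch of the alternative: instead of spawning a fresh thread per chunk (finished-but-never-joined threads leak their storage, which fits the hang near the 1020 mark), a fixed pool of workers pulls jobs off a shared queue and is joined once at the end. This is illustrative only, not the poster's code; it uses the modern module name Thread::Queue (on 5.8.0-era perls the same thing shipped as threads::shared::queue), and the job ids and pool size are made up.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $jobs    = Thread::Queue->new;
    my $results = Thread::Queue->new;
    my $workers = 5;    # pool size: tune to your bandwidth, not your job count

    # Queue the work, then one undef per worker as an end-of-work marker.
    $jobs->enqueue($_) for 1 .. 50;
    $jobs->enqueue(undef) for 1 .. $workers;

    my @pool = map {
        threads->create(sub {
            # dequeue blocks until a job (or an undef end marker) arrives.
            while ( defined( my $job = $jobs->dequeue ) ) {
                # Real code would fetch a URL here; we just echo the job id.
                $results->enqueue("done $job");
            }
        });
    } 1 .. $workers;

    # Join from the main thread -- never from inside the thread itself.
    $_->join for @pool;
    print "processed ", $results->pending, " jobs with $workers threads\n";
    ```

    The point of the undef markers is that each worker consumes exactly one, so every thread falls out of its loop and can be joined cleanly, with no shared $finished flag to race on.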

      Slappy GNU Year and all that! :) The holidays have given me time to tackle this a bit more closely. One thing that was worrying me: you seem to be starting one thread per chunk of info from getData(), and replenishing the pool with a new thread each time a worker finishes and joins (have we mentioned that joining threads->self is very, very bad?).

      So I played around with a pool of 'fulltime' workers that perform the same action ad infinitum, a 'foremen' thread, and the main thread.

      Using my favorite breeding ground for URLs (squid's access.log), I grabbed a list of 250 jpg image URLs to use as the data to feed the threads. The foremen thread takes a line at a time from the file urls and pushes it onto a queue. It might be wise to limit the length of the pending queue for memory's sake, but I skipped that for this example.

      Worker threads try to grab the next item from the URL queue; if the queue is empty the thread sleeps, otherwise it downloads the URL with LWP::Simple. Then, if $finished has not been set, it repeats the loop.

      #!/usr/dev/perl/bin/perl -w
      use strict;
      use threads;
      use threads::shared;
      use threads::shared::queue;
      use LWP::Simple;
      use Data::Dumper;

      $| = 1;

      my $results = new threads::shared::queue;
      my $urls    = new threads::shared::queue;
      my $max_threads = 20;
      my $finished : shared;
      my $in       : shared;
      my $out      : shared;
      my $total    : shared;
      $finished = 0;
      $in       = 0;
      $out      = 0;
      $total    = 0;

      # Foreman arrives before workers?
      threads->new( 'foremen' );

      # Start all the workers
      for (1..$max_threads) { threads->new( 'worker' ) };

      # Main loop
      do {
          my $result = $results->dequeue_nb;
          if ($result) {
              $out++;
              print $result, $/;
          }
          else {
              print "wait: total records $total , results returned $out\n";
              sleep 1;
          }
          if ( $out == ( $total - $max_threads ) ) { $finished = 1 };
      } until ( $out == $total );

      # Cleanup
      print "Waiting for remaining threads to detach/exit\n";
      my @threads;
      do {
          @threads = threads->list;
          sleep 1;
      } until ( 1 == scalar(@threads) ) && print "Exiting\n";

      ### Send in the subroutines ###

      sub foremen {
          open D, 'urls' or die "screaming $!";
          while ( <D> ) {
              chomp;
              $urls->enqueue($_);
              $in++;
          }
          $total = $in;
          (threads->self)->detach;
      }

      sub worker {
          do {
              my $url = $urls->dequeue_nb;
              if ($url) {
                  $url =~ /([^\/]+)$/;
                  my $file = $1;
                  unless ( $file ) { print "Failed , $url \n" }
                  my $result = getstore( $url, $file );
                  $results->enqueue( "$result|$url" );
              }
              else { sleep 1 }
          } until ( $finished );
          print threads->tid, " - finished, detaching\n";
          (threads->self)->detach;
      }
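      One side note on the foremen's open call: writing it as open D , 'urls' || die ... is a classic precedence trap. || binds tighter than the comma, so the die attaches to the filename string (which is always true) and never to open's return value; a failed open sails straight past it. A tiny demo, with a made-up path that should not exist:

      ```perl
      #!/usr/bin/perl
      use strict;
      use warnings;

      # With ||, the second argument is "'/no/such/file' || die ...";
      # the true filename string short-circuits, so die is unreachable
      # and a failed open passes silently.
      my $ok = open( my $buggy, '<', '/no/such/file' || die "never reached" );
      print $ok ? "opened\n" : "open failed silently\n";

      # Low-precedence 'or' binds after open's whole argument list, so it
      # actually tests open's return value.
      open( my $fixed, '<', '/no/such/file' )
          or print "or-form caught the failure: $!\n";
      ```

      That's why the idiomatic form is open(...) or die "...: $!" -- and note the double quotes, since a single-quoted ' $! ' never interpolates the error message.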