http://qs1969.pair.com?node_id=644571

BBQ has asked for the wisdom of the Perl Monks concerning the following question:

So, I wanted to get some monkly feedback on whether this makes sense and whether it seems like a reasonable approach to my problem. This is my first incursion into the beautiful world of threads, and everything seems to work as intended, but hey, peer review wouldn't hurt.

I wrote a subroutine that uses WWW::Mechanize to simulate user activity on a few pages / web forms. I have an input file with a few thousand records that passes parameters to the said subroutine, but I wanted to have 5 concurrent sessions and no more than 5. After doing a little reading on-line, it seemed like threads was a good candidate for this type of scenario so I wrote the following test script as a proof of concept.

It seems to work well but as they say, there's always more than one way of doing things. Oh, and I'm running ActiveState Perl 5.8.8 (build 822) on Windows XP.

Any takers?
#!/usr/bin/perl -w
use strict;
use threads;

my $maxThreads = 5;

open(LOGGER,'>>logfile') or die();

for my $num (1..50) {
    print "Starting thread # $num\n";
    my $thread = async { &doStuff($num); };
    while (threads->list(threads::running) > $maxThreads) {
        for my $thread (threads->list(threads::all)) {
            if ($thread->is_joinable()) {
                print $thread->join();
            }
        }
    }
}

print "Waiting for the last ". threads->list(threads::running) ." threads to finish\n";
while (threads->list(threads::all)) {
    for my $thread (threads->list(threads::joinable)) {
        print $thread->join();
    }
}

close LOGGER;
sleep(1);

sub doStuff {
    my $num = shift;
    my $secs = int(rand(3));
    print LOGGER "Thread # $num is napping for $secs seconds.\n";
    sleep $secs;
    print LOGGER "Thread # $num is awake.\n";
    return "This returned from thread # $num\n";
}

Replies are listed 'Best First'.
Re: Controlling Thread Numbers
by NetWallah (Canon) on Oct 13, 2007 at 01:21 UTC
    Rather than starting and stopping threads to feed input, it would be preferable to start a fixed set of threads and feed them work to do. Here is a snippet of the code I use (in production on Windows / Perl 5.8 / ActiveState):
    use threads;
    use Thread::Queue;
    use Net::Ping;

    our $KIDS   = $count > 20? 10 : int ($count / 4) || 1; # Some way to control max # of threads
    my $work_Q  = new Thread::Queue;

    ## Start the kids
    my @kids = map{ threads->create( \&kid, $_) } 1 .. $KIDS;

    $work_Q->enqueue( keys %ServerInfo ); # Feed the QUEUE with work to do
    ....
    ## Tell them to stop
    $work_Q->enqueue( (undef) x $KIDS ); # This tells each thread to QUIT

    ## And wait for Kids to finish..
    my %results = map {$_->join} @kids;

    #----------------------------------------------------
    sub kid {
        my $kidnumber = shift;
        my @results=();
        my $tid = threads->tid;
        my $count=0;
        my $pinger = Net::Ping->new("icmp",$TimeOut);
        #printf "Kid: %02d started<br>\n", $tid;
        # Loop ends when the thread dequeues the undef sentinel
        while( my $work = $work_Q->dequeue ) {
            $count ++;
            push @results, $work, Ping_it($pinger, $work) ;
        }
        push @results,"KID$kidnumber",$count;
        #print "kid: $tid ending after processing $count items.<br>\n";
        return @results;
    }
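Since the snippet above depends on variables defined elsewhere, here is a minimal, self-contained sketch of the same worker-pool idea applied to the original test script. It is only an illustration under stated assumptions: do_stuff, worker, and run_pool are hypothetical names, and do_stuff merely stands in for the WWW::Mechanize routine. It also assumes a Perl built with thread support.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $max_threads = 5;                    # hard cap on concurrent workers
my $work_q      = Thread::Queue->new;   # shared queue of pending work items

# Hypothetical stand-in for the real per-record routine.
sub do_stuff {
    my ($num) = @_;
    return "This returned from work item # $num\n";
}

# Each worker pulls items until it dequeues an undef sentinel.
sub worker {
    my @out;
    while (defined(my $item = $work_q->dequeue)) {
        push @out, do_stuff($item);
    }
    return @out;
}

# Start exactly $max_threads workers, feed them, then collect their results.
sub run_pool {
    my @items = @_;
    # create() is called in list context here, so join() returns a list.
    my @workers = map { threads->create(\&worker) } 1 .. $max_threads;
    $work_q->enqueue(@items);
    $work_q->enqueue((undef) x $max_threads);   # one stop signal per worker
    return map { $_->join } @workers;
}

my @results = run_pool(1 .. 50);
print scalar(@results), " results collected\n";
```

Because only $max_threads threads are ever created, the cap of 5 holds by construction, and there is no polling loop waiting for threads to become joinable.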

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom

      Hey NetWallah,

      Thanks for the snippet!

      I'm having a hard time following though... on this line:

      our $KIDS   = $count > 20? 10 : int ($count / 4) || 1;

      Won't $KIDS always be assigned the value 1, unless $count has already been assigned a value elsewhere? And I'm assuming that the keys in %ServerInfo are IP addresses that get passed to another subroutine called Ping_it()?

      Sorry if I'm being a little dense here. Just trying to get what goes where and there seem to be a few pieces of the puzzle still missing.

      Thanks again!
        Sorry - the snippet was indeed incomplete without an explanation of external variables.

        "$count" comes from the number of servers I was processing. Under test conditions, it would be well under 20, and I would do all the work in fewer than 5 threads (only one thread if $count was less than 8, since int($count / 4) stays below 2 until then). Under "normal" conditions, it was in the 200 range, resulting in 10 threads.
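To make the sizing rule concrete, this small sketch evaluates the same expression for a few server counts. The wrapper name kids_for and the sample counts are made up for illustration; only the expression itself comes from the snippet.

```perl
use strict;
use warnings;

# Same expression as in the snippet; it parses as
# $count > 20 ? 10 : (int($count / 4) || 1)
sub kids_for {
    my ($count) = @_;
    return $count > 20 ? 10 : int($count / 4) || 1;
}

# Sample counts chosen for illustration only.
printf "%3d servers -> %2d threads\n", $_, kids_for($_) for (0, 3, 8, 20, 21, 200);
```

With a count of 0 the result falls back to 1, which matches the observation that $KIDS is 1 when $count has not been set to anything useful.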

        %ServerInfo was a HoH (hash of hashes) that held other information besides the IP: the server name and related details. I was laid off work on Tuesday 10/16 and no longer have access to the code I wrote, or I would gladly share it.
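As a rough picture of what such a hash of hashes might look like, here is a sketch. The original code is not available, so it assumes, as BBQ guessed, that the keys are IP addresses; the field names and values are entirely hypothetical.

```perl
use strict;
use warnings;

# Hypothetical layout: IP address => { details about that server }
my %ServerInfo = (
    '192.0.2.10' => { name => 'web01', role => 'frontend' },
    '192.0.2.11' => { name => 'db01',  role => 'database' },
);

# The keys are what would be enqueued as work items.
my @work_items = sort keys %ServerInfo;

for my $ip (@work_items) {
    printf "%s is %s (%s)\n", $ip, $ServerInfo{$ip}{name}, $ServerInfo{$ip}{role};
}
```

Only the keys travel through the queue; a worker can look up the remaining details in the hash when it needs them.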

             "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom