znu has asked for the wisdom of the Perl Monks concerning the following question:

Hi people. I'm having some trouble writing a threaded script. I'm trying to write a worker/worker-pool program with a fixed number of worker threads (say 100) working on similar tasks; as soon as one finishes, another is launched to maintain 100 simultaneous workers. The following snippet works really well... until it has launched around 1060 threads, at which point it gets stuck in the outer while loop. Check out the code; I'm fairly new to threads, so I could be doing something drastically wrong. If so, could you please point me in the right direction for implementing the above functionality? Thanks in advance!

Main loop handling workers:

    my @running_threads = ();
    while ($finished eq "false") {
        @running_threads = threads->list;
        if (scalar(@running_threads) < $worker_num) {
            print "launching thread with count:$count\n";
            $somedata = getData();
            if ($somedata ne "") {
                threads->new(\&worker, $somedata);
            }
            else {
                $finished = "true";
            }
        }
    }

And this is the code for the worker:

    sub worker {
        # do whatever
        eval((threads->self)->join);
    }

Replies are listed 'Best First'.
Re: Fun with threads
by pg (Canon) on Dec 20, 2002 at 02:33 UTC
    1. Your worker joins itself: it waits for its own termination before it can terminate, a condition that can never be satisfied.
      use threads;
      $| ++;
      threads->create(\&a);
      while (1) {}
      sub a {
          eval(threads->self->join); #Waiting for Godot
          print "before return\n"; #will not show up
      }
      Here is one normal way to use join:
      use threads;
      $| ++;
      $child = threads->create(\&a);
      $child->join; # I am waiting for the $child thread to finish first
      print "Main thread stop\n";
      sub a {
          for (1..5) {
              print "child is still counting, $_\n";
              sleep(3);
          }
          print "child thread stop\n";
      }
    2. join provides a coarse way to synchronize, much as waitpid does for processes: one process or thread stalls until the other has finished. For finer-grained synchronization, consider locks, condition variables, and semaphores. Let's use fork and waitpid to do something similar to what we did in point 1:

      This works:
      $| ++;
      if (($chld = fork()) == 0) {
          sleep(3);
          print "child exit\n";
      }
      else {
          waitpid $chld, 0;
          print "parent exit\n";
      }
      This hangs:
      $| ++;
      waitpid $$, 0; #wait for Godot
      print "exit\n";
    3. In real life, you always need to put an upper limit on the number of threads you can have. Where is the limit? It really depends on your context; every bit of effort you put into tuning your multi-threaded application is worthwhile.
    In a multi-threaded program, always turn on $| when debugging, so that buffered output does not confuse you.
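
    Points 1 and 3 can be combined in a minimal sketch, assuming a reasonably recent threads module that provides threads->list(threads::running) and threads->list(threads::joinable). The parent, not the worker, does the joining, and the number of running threads is capped. The names $max_threads, @jobs, and the body of worker are hypothetical stand-ins, not the original poster's code.

```perl
use strict;
use warnings;
use threads;

$| = 1;    # unbuffered STDOUT, so prints from different threads show up promptly

my $max_threads = 4;           # hypothetical cap; tune it for your workload
my @jobs        = (1 .. 12);   # hypothetical stand-in for getData()
my @results;

sub worker {
    my ($job) = @_;
    return $job * 2;           # placeholder for the real work
}

while (@jobs or threads->list) {
    # The parent reaps finished threads; a thread never joins itself.
    push @results, $_->join for threads->list(threads::joinable);

    # Top the pool back up, never exceeding the cap on running threads.
    while (@jobs and threads->list(threads::running) < $max_threads) {
        # Ask for scalar context explicitly so join() returns the result.
        threads->create({ context => 'scalar' }, \&worker, shift @jobs);
    }

    select(undef, undef, undef, 0.05);   # short sleep instead of a hot spin
}

print "collected ", scalar(@results), " results\n";
```

    Polling with a short sleep keeps the sketch small; a queue or semaphore would avoid polling altogether.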
      Hmmm, I see. So do you mean that since it does print "before return", the join seems to have no effect? Does the thread then stop when it reaches the end of the sub? Also, what does $| ++; do? Thanks!
      ++pg PMSL @ script that is 'Waiting for Godot'.
Re: Fun with threads
by submersible_toaster (Chaplain) on Dec 20, 2002 at 03:22 UTC

    In addition to the good advice already posted, you may like to consider a serious remodel of your process. Rather than joining your threads to retrieve their results, you could set up a 'return queue'. You start each worker thread, passing it $somedata and a reference to your return queue, such as one made by threads::shared::queue. Detach the thread, and forget you ever started it. In your worker sub, ->enqueue your processed return data, and exit the thread.

    I have a snippet that demonstrates simple use of queues. The networking stuff is pretty poor and not related to your problem so best to ignore that.

    This way you have a simple accessor to finished data in your main thread, by ->dequeue'ing from your queue object. Once the detached worker threads enqueue their results, they can die a natural death (exit), which should be reflected in the number that threads->list returns. As @running_threads falls below your 'apparent' optimal thread count, top up the pool by starting another thread.

    Of course I'm going to run off and try this now, but methinks the theory is ok
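
    A minimal sketch of the return-queue theory, with some loudly stated assumptions: it uses Thread::Queue rather than the threads::shared::queue named above, and it counts in-flight workers by the results it dequeues, because in current versions of the threads module threads->list does not include detached threads. The names $worker_num, @jobs, $in_flight, and the body of worker are hypothetical.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $worker_num = 4;                          # hypothetical pool size
my @jobs       = map { "job$_" } 1 .. 10;    # hypothetical stand-in for getData()
my $results    = Thread::Queue->new;         # the 'return queue'

sub worker {
    my ($somedata, $queue) = @_;
    $queue->enqueue(uc $somedata);           # placeholder for real processing
}

my $in_flight = 0;    # workers started whose result has not been dequeued yet
my @done;

while (@jobs or $in_flight) {
    # Top up the pool with detached workers, up to $worker_num at a time.
    while (@jobs and $in_flight < $worker_num) {
        threads->create(\&worker, shift @jobs, $results)->detach;
        $in_flight++;
    }

    # Block until one worker reports back, then free its slot.
    push @done, $results->dequeue;
    $in_flight--;
}

print "got ", scalar(@done), " results\n";
```

    Two caveats with this sketch: a worker that dies before it enqueues anything would leave the main thread blocked in dequeue, and the main thread may reach its end while the last detached worker is still winding down, which perl can report as a warning.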



    Good Luck!
      Thanks, but does it work for you? The queuing is a good idea and I will implement it at some stage, but I don't think that's my problem at the moment. Essentially, I think I'm doing the same thing as you suggested. Even when I run a completely stripped-down version of the script, i.e. each thread does no data processing whatsoever and just prints to the screen, I get the same result. Furthermore, if I change it so that each thread detaches itself instead of joining itself, I get a funny phenomenon where, after around 1500 threads are launched, the script dies printing "Killed" to the screen :-/

        I have to admit to being v.v. confused by what you're aiming to achieve here. You pass data to the threads from getData(), checking that $somedata is not empty, and finish when it is. Are these threads CPU intensive, chewing some furry metrics from $somedata? Or do they use $somedata to build an I/O pipe like a network socket? In the latter case, 1500 threads (phew) could be munching away occasionally on their sockets. 1500 threads all trying to churn $somedata... I'd never dare. I'm pretty sure the recommendations for Perl threads say that fewer == better.


        Still confused; please post more code. I think this is going to bug me and my poor Linux box all weekend.

        Prolly be useful to know what OS, CPU, and perl flavours you're stirring with this program too. : )



        i can't believe its not psellchecked
Re: Fun with threads
by batkins (Chaplain) on Dec 20, 2002 at 02:35 UTC
    100 threads? are you sure that's the best approach?

    be sure that, if you're going to use that many threads, the threads are at least blocking. if you have 100 threads all waiting for input from different sockets, you won't see too much performance loss. but 100 active threads could easily get out of hand.
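
    One way to keep threads blocked rather than active is a small, fixed pool whose workers wait on a shared work queue. This is a minimal sketch, assuming Thread::Queue is available; $worker_num, the work items, and the body of worker are hypothetical, and an undef item is used here as a stop marker.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $worker_num = 4;                    # hypothetical; far fewer than 100
my $work       = Thread::Queue->new;   # shared queue of pending items

sub worker {
    my $handled = 0;
    # dequeue blocks while the queue is empty, so an idle worker burns no CPU.
    while (defined(my $item = $work->dequeue)) {
        $handled++;                    # placeholder for real processing of $item
    }
    return $handled;                   # number of items this worker processed
}

# Start the pool once; scalar context so join() returns each worker's count.
my @pool = map { threads->create({ context => 'scalar' }, \&worker) } 1 .. $worker_num;

$work->enqueue($_) for 1 .. 20;                # hypothetical work items
$work->enqueue(undef) for 1 .. $worker_num;    # one stop marker per worker

my $total = 0;
$total += $_->join for @pool;
print "handled $total items\n";
```

    Because the same threads are reused for every item, the thread count stays constant no matter how much work arrives.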