multithreads newbie question

daverave has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: multithreads newbie question by roboticus (Chancellor) on Aug 10, 2010 at 12:46 UTC
daverave: There are many ways you can manage code in threads. Since you have a bunch of dependency relationships in your subroutines, though, I'd suggest a worker/task model. In it, you have a set of worker threads, and they simply pull jobs off the queue. Your queueing logic needs to handle the dependencies somewhere. I'd suggest something simple like: my $quit = 0; my %tasks = ( sub_1 => { code=>\&sub_1, state=>'idle', deps=>[ ] }, sub_2 => { code=>\&sub_2, state=>'idle', deps=>[ ] }, sub_3 => { code=>\&sub_3, state=>'idle', deps=>[ 'sub_1' ] }, sub_4 => { code=>\&sub_4, state=>'idle', deps=>[ 'sub_1' ] }, sub_5 => { code=>\&sub_5, state=>'idle', deps=>[ 'sub_2' ] }, sub_6 => { code=>\&sub_6, state=>'idle', deps=>[ 'sub_2' ] }, sub_7 => { code=>\&sub_7, state=>'idle', deps=>[ 'sub_1', 'sub_2' ] }, ); sub sleep { # No work to do, so doze for a while ... } sub get_next_task { my $task; for my $cur (keys %tasks) { next unless $tasks{$cur}{state} eq 'idle'; my $deps_not_ready = 0; for my $dep (@{$tasks{$cur}{deps}}) { if ($tasks{$dep}{state} ne 'done') { ++$deps_not_ready; } } next unless $deps_not_ready == 0; return $task; } } sub thread { while (! $quit) { my $task = get_next_task(); if (defined $task) { $tasks{$task}{state}='busy'; $tasks{$task}{started}=time; if (&{$tasks{$task}{code}}()) { $tasks{$task}{state}='done'; } else { $tasks{$task}{state}='FAULT'; } $tasks{$task}{finished}=time; } else { # No tasks are available right now sleep(); } } } [download] The preceding (untested!) code is just a description of how I'd approach your problem. It's untested and has a race condition in it: Specifically, if a task switch happens at an inopportune time, then multiple threads could start processing the same task. You'll need to put an interlock (such as a mutex) in there somewhere. (For simplicity, I'd put something like a spinlock at the top of get_next_task and allow only one thread at a time to get a task from the list.) If your dependency tree is complete, you can even reduce the set of tasks. There's no reason you couldn't have the first task execute sub_1, sub_3 and then sub_4, for example. That would remove two entries from %tasks. The reasons I like this particular approach are: You can easily adjust the number of threads without worrying about rearranging the dependency tree. If you use some persistence, you can even pause/stop the scheduler and resume later. I've already implemented it once, so I can reuse some old code... ;^) Editing the dependency tree is simple. Since the task states are maintained in the structure, it's easy to build a reporting screen showing the progress of the system. If you use this idea, feel free to post the finished code when you're done. That way, I can use it in the future. (My version was in C#, and it might be handy to have it in perl some time...) ...roboticus	[reply] [d/l]
Re^2: multithreads newbie question by BrowserUk (Patriarch) on Aug 10, 2010 at 12:54 UTC
Oh those good ol' bad ol' days :)	[reply]
Re^2: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 13:24 UTC
roboticus, I wanted to thank you for your kind attention. I'm going with BrowserUK solution since it seems to fit me demands and be simpler but it was really nice to learn how to use a worker/task model. Thanks!	[reply]
Re: multithreads newbie question by BrowserUk (Patriarch) on Aug 10, 2010 at 12:51 UTC
This meets the spec, though probably not your requirements: #! perl -slw use 5.010; use strict; use threads; sub sub1 { say"sub1 starts"; say("sub1:$_"),sleep 1 for 1..3; say"sub1 + ends" } sub sub2 { say"sub2 starts"; say("sub2:$_"),sleep 1 for 1..3; say"sub2 + ends" } sub sub3 { say"sub3 starts"; say("sub3:$_"),sleep 1 for 1..3; say"sub3 + ends" } sub sub4 { say"sub4 starts"; say("sub4:$_"),sleep 1 for 1..3; say"sub4 + ends" } sub sub5 { say"sub5 starts"; say("sub5:$_"),sleep 1 for 1..3; say"sub5 + ends" } sub sub6 { say"sub6 starts"; say("sub6:$_"),sleep 1 for 1..3; say"sub6 + ends" } sub sub7 { say"sub7 starts"; say("sub7:$_"),sleep 1 for 1..3; say"sub7 + ends" } sub sub8 { say"sub8 starts"; say("sub8:$_"),sleep 1 for 1..3; say"sub8 + ends" } my $t1 = async{ sub1(); async { sub3(); }->detach; sub4(); }; my $t2 = async{ sub2(); async{ sub5(); }->detach; sub6(); }; $_->join for $t1, $t2; my $t3 = async{ sub7() }; sub8(); $t3->join; print "main ends"; __END__ c:\test>854022.pl sub1 starts sub1:1 sub2 starts sub2:1 sub1:2 sub2:2 sub1:3 sub2:3 sub1 ends sub4 starts sub3 starts sub4:1 sub3:1 sub2 ends sub6 starts sub6:1 sub5 starts sub5:1 sub4:2 sub3:2 sub6:2 sub5:2 sub3:3 sub4:3 sub6:3 sub5:3 sub4 ends sub3 ends sub6 ends sub5 ends sub8 starts sub8:1 sub7 starts sub7:1 sub8:2 sub7:2 sub8:3 sub7:3 sub8 ends sub7 ends main ends [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. RIP an inspiration; A true Folk's Guy	[reply] [d/l]
Re^2: multithreads newbie question by roboticus (Chancellor) on Aug 10, 2010 at 13:01 UTC
BrowserUk: Perhaps I should remove the phrase "something simple like this" from my post, as your example is complete and simpler. I still like mine for the flexibility, though, but yours is a better simple example of how to solve the problem. ...roboticus	[reply]
Re^3: multithreads newbie question by BrowserUk (Patriarch) on Aug 10, 2010 at 13:13 UTC
I think the main advantage (I see), of mine over yours is that there can be no race conditions, priority inversions, deadlocks, or any of the other nasties that thread naysayers like to run on about. People tend to scoff when I suggest the best way to avoid such things is to not program them. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. RIP an inspiration; A true Folk's Guy	[reply]
Re^4: multithreads newbie question by roboticus (Chancellor) on Aug 10, 2010 at 13:51 UTC
Re^2: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 13:21 UTC
Thank you so very much. I think this does meet the requirements (why not)? Just one thing I would like to make sure: when I use the main part of your code ($t1, $t2 and $t3) as a body of a function, and the function returns, do I know for sure that all threads are done? I read the documentation for `detach` and it scared me a little bit :)	[reply] [d/l]
Re^3: multithreads newbie question by BrowserUk (Patriarch) on Aug 10, 2010 at 13:46 UTC
I read the documentation for detach and it scared me a little bit :) So avoid detach and use a couple more joins: #! perl -slw use 5.010; use strict; use threads; sub sub1 { say"sub1 starts"; say("sub1:$_"),sleep 1 for 1..3; say"sub1 + ends" } sub sub2 { say"sub2 starts"; say("sub2:$_"),sleep 1 for 1..3; say"sub2 + ends" } sub sub3 { say"sub3 starts"; say("sub3:$_"),sleep 1 for 1..3; say"sub3 + ends" } sub sub4 { say"sub4 starts"; say("sub4:$_"),sleep 1 for 1..3; say"sub4 + ends" } sub sub5 { say"sub5 starts"; say("sub5:$_"),sleep 1 for 1..3; say"sub5 + ends" } sub sub6 { say"sub6 starts"; say("sub6:$_"),sleep 1 for 1..3; say"sub6 + ends" } sub sub7 { say"sub7 starts"; say("sub7:$_"),sleep 1 for 1..3; say"sub7 + ends" } sub sub8 { say"sub8 starts"; say("sub8:$_"),sleep 1 for 1..3; say"sub8 + ends" } my $t1 = async{ sub1(); my $t = async { sub3(); }; sub4(); $t->join; }; my $t2 = async{ sub2(); my $t = async{ sub5(); }; sub6(); $t->join }; $_->join for $t1, $t2; my $t3 = async{ sub7() }; sub8(); $t3->join; print "main ends"; [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. RIP an inspiration; A true Folk's Guy	[reply] [d/l]
Re^4: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 14:08 UTC
Re^5: multithreads newbie question by BrowserUk (Patriarch) on Aug 10, 2010 at 16:07 UTC
Re: multithreads newbie question by ikegami (Patriarch) on Aug 10, 2010 at 14:24 UTC
`join` only waits for one thread to end, whereas you want to know when the first of two threads to end. I used Thread::Queue to achieve that goal. use strict; use warnings; use threads; use Thread::Queue qw( ); my $q = Thread::Queue->new(); my $t1 = async { sub1(); $q->enqueue(1); }; my $t2 = async { sub2(); $q->enqueue(2); }; my $pending = 2; my @threads; for (;;) { my $id = $q->dequeue(); if ($id == 1) { $t1->join(); push @threads, async { sub3() }, async { sub4() }; --$pending; } elsif ($id == 2) { $t2->join(); push @threads, async { sub5() }, async { sub6() }; --$pending; } if (!$pending) { push @threads, async { sub7() }, async { sub8() }; last; } } $_->join() for @threads; [download] Update: Fixed the c&p bug identified in the reply.	[reply] [d/l] [select]
Re^2: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 14:37 UTC
Great! Thank you, much appreciated! Just a small typo: the second `$t1->join();` should be `$t2->join();`	[reply] [d/l] [select]
Re^3: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 15:27 UTC
Something I wasn't expecting. I'm passing one of the subroutines a hash and the subroutine does some changes in it. But those changes are not stored in the hash after the subroutines finishes! Note that there need not be a problem of sync between the threads. They do not write to common fields. as I mentioned, I actually use this in an object so I pass `$self`, but this example also illustrates the problem: ##! perl -slw use 5.010; use strict; use threads; use Thread::Queue qw( ); sub sub1 { say"sub1 starts"; sleep 1 for 1..2; say"sub1 ends" } sub sub2 { my $hashref = shift; $hashref->{KEY}=1; exists($hashref->{K +EY}) ? say "KEY exists" : say "KEY does NOT exist"; say"sub2 starts"; sleep 1 for 1..2; say"sub2 ends" } sub sub3 { say"sub3 starts"; sleep 1 for 1..5; say"sub3 ends" } sub sub4 { say"sub4 starts"; sleep 1 for 1..5; say"sub4 ends" } sub sub5 { say"sub5 starts"; sleep 1 for 1..5; say"sub5 ends" } sub sub6 { say"sub6 starts"; sleep 1 for 1..5; say"sub6 ends" } sub sub7 { say"sub7 starts"; sleep 1 for 1..2; say"sub7 ends" } sub sub8 { say"sub8 starts"; sleep 1 for 1..2; say"sub8 ends" } my $hashref = {}; my $q = Thread::Queue->new(); my $t1 = async { sub1(); $q->enqueue(1); }; my $t2 = async { sub2($hashref); $q->enqueue(2); }; my $pending = 2; my @threads; for (;;) { my $id = $q->dequeue(); if ($id == 1) { $t1->join(); push @threads, async { sub3() }, async { sub4() }; --$pending; } elsif ($id == 2) { $t2->join(); exists($hashref->{KEY}) ? say "KEY exists" : say "KEY does NOT e +xist"; push @threads, async { sub5() }, async { sub6() }; --$pending; } if (!$pending) { push @threads, async { sub7() }, async { sub8() }; last; } } $_->join() for @threads; [download]	[reply] [d/l] [select]
Re^4: multithreads newbie question by ikegami (Patriarch) on Aug 10, 2010 at 15:44 UTC
Re^5: multithreads newbie question by daverave (Scribe) on Aug 10, 2010 at 17:27 UTC
Some notes below your chosen depth have not been shown here
Re^2: multithreads newbie question by daverave (Scribe) on Aug 11, 2010 at 15:26 UTC
(please ignore)	[reply]