Veltro has asked for the wisdom of the Perl Monks concerning the following question:
Hello everyone,
I have constructed this piece of code to illustrate three problems and I would like to know: How would you write this code so that it works?
Problem 1: The join is located at a problematic position due to the nature of how hashes work. Sometimes the code takes 6 seconds to execute and sometimes 2 + 6 seconds.
Problem 2: L2_counter1 => 3 is not incremented
Problem 3 (actually a question): Why do I need to use a share? The data is incremented independently, so I would think there should be no problems regarding synchronization. However, if I don't use the share, nothing gets incremented at all...
Here is the code:
use strict ;
use warnings ;

use MCE::Hobo ;
use MCE::Shared ;
use Data::Dumper ;

sub task1 {
    print "Starting task 1 for $_[0]\n" ;
    sleep(2) ;
    print "Finished task 1 for $_[0]\n" ;
}

sub task2 {
    print "Starting task 2 for $_[0]\n" ;
    sleep(4) ;
    print "Finished task 2 for $_[0]\n" ;
}

sub task3 {
    print "Starting task 3 for $_[0]\n" ;
    sleep(6) ;
    print "Finished task 3 for $_[0]\n" ;
}

MCE::Hobo->init(
    max_workers => 2,
    # hobo_timeout => 10,
    # posix_exit => 1,
) ;

my $mutex = MCE::Mutex->new;

my $_test = {
    L1_counter1 => 1,
    # L1_counter2 => 2,
    # L1_counter3 => 3,
    nested1 => {
        L2_counter1 => 3,
        # L2_counter2 => 2,
        # L2_counter3 => 1,
    },
} ;

my $test ;
tie %{$test}, 'MCE::Shared', { module => 'MCE::Shared::Hash' }, %{$_test} ;

print Dumper( $test ) ;

sub executeTasks {
    my $in = $_[0] ;
    my $hobo ;
    foreach( keys %{$in} ) {
        if ( ref $in->{ $_ } eq 'HASH' ) {
            executeTasks( $in->{ $_ } ) ;
        } else {
            if ( $in->{ $_ } == 1 ) {
                $hobo = mce_async { task1( $_ ) ; ++$in->{ $_ } ; } ;
            } elsif ( $in->{ $_ } == 2 ) {
                $hobo = mce_async { task2( $_ ) ; ++$in->{ $_ } ; } ;
            } elsif ( $in->{ $_ } == 3 ) {
                $hobo = mce_async { task3( $_ ) ; ++$in->{ $_ } ; } ;
            } ;
        } ;
    } ;
    $hobo->join() ;
} ;

executeTasks( $test ) ;

print "\n" ;
print Dumper( $test ) ;
Replies are listed 'Best First'.
Re: Hobo with a bit of recursion
by marioroy (Prior) on Jun 17, 2018 at 01:56 UTC

Updated: Fully Perl-like behavior for the 1st demonstration.

Greetings Veltro,

Your post presents an interesting use case, for sure. Let me give it a try, hoping that it all works eventually :)

Q & A

Problem 1: The join is located at a problematic position due to the nature of how hashes work. Sometimes the code takes 6 seconds to execute and sometimes 2 + 6 seconds.

Hashes are not ordered in Perl. In the slow runs, the ref key-value went first: the recursive call to your execute routine spawns the 6-second task and then reaches $hobo->join, so it waits for that task to finish before the 2-second task is even spawned. The remedy is to move the $hobo->join statement out of the routine. It is not needed inside the execute routine when max_workers is given to MCE::Hobo->init.

Problem 2: L2_counter1 => 3 is not incremented.

The tie statement does not deeply share key-values during construction (bug, fix planned for v1.837). The way to share nested hash/array structures is explicitly via the STORE method.

Problem 3: Why do I need to use a share? The data is incremented independently, so I would think there should be no problems regarding synchronization. However, if I don't use the share, nothing gets incremented at all...

The short answer is that each worker has its own copy of every non-shared variable, so an increment made in a worker never reaches the main process. Thus, sharing is necessary. MCE::Shared spawns a separate process (a thread on the Windows platform) where the shared data resides. Workers, including the main process, communicate with the shared-manager process using sockets.

Demo 1: via Perl-like behavior

Shown with a mutex in case multiple workers update the same key. The reason is that ++ involves two trips to the shared-manager process (FETCH and STORE).
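A minimal sketch of the mutex idea, not the demo from this reply: it assumes a flat tied hash with a single hypothetical key, hits, and leaves the nested data out. Four workers increment the same key, and each FETCH/STORE pair is wrapped in a lock so that no other worker can slip in between the two trips.

```perl
use strict;
use warnings;

use MCE::Hobo;
use MCE::Mutex;
use MCE::Shared;

MCE::Hobo->init( max_workers => 2 );

my $mutex = MCE::Mutex->new;

# Flat shared hash via the TIE interface; 'hits' is a made-up key.
tie my %count, 'MCE::Shared';
$count{hits} = 0;

for my $id ( 1 .. 4 ) {
    mce_async {
        for ( 1 .. 50 ) {
            # ++ on a tied shared hash is a FETCH followed by a STORE,
            # so the pair is guarded to keep other workers from interleaving.
            $mutex->lock;
            ++$count{hits};
            $mutex->unlock;
        }
    };
}

# Join whatever is still running, once, outside the spawning loop.
$_->join for MCE::Hobo->list;

my $total = $count{hits};
print "hits: $total\n";
```

Without the lock, two workers could both fetch the same value and store the same successor, losing an increment.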
Demo 2: using the OO interface

This eliminates having a mutex at the application level. Also, the OO interface does not involve TIE, so it has less overhead.
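A comparable sketch for the OO interface, again with a hypothetical hits key rather than the original data. The incr method of the shared hash performs the whole increment inside the shared-manager process, so one trip is enough and no application-level lock is needed.

```perl
use strict;
use warnings;

use MCE::Hobo;
use MCE::Shared;

MCE::Hobo->init( max_workers => 2 );

# Shared hash constructed through the OO interface (no tie involved).
my $count = MCE::Shared->hash( hits => 0 );

for my $id ( 1 .. 4 ) {
    mce_async {
        # Each incr is carried out by the shared-manager in a single request.
        $count->incr('hits') for 1 .. 50;
    };
}

$_->join for MCE::Hobo->list;

my $total = $count->get('hits');
print "hits: $total\n";
```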
Output
Regards, Mario

by Veltro (Hermit) on Jun 17, 2018 at 16:23 UTC

Hello marioroy,

Thanks for your excellent reply. Besides giving me a couple of very nice examples, you also explained sharing to me and why it matters. I am starting to see that I definitely need to study this subject some more.

One of your answers leads me to another question regarding sharing nested objects.

In case objects are shared that do not have STORE and FETCH methods, and they have nested methods and objects, does this mean that the methods cannot be reached and the objects don't get shared?

Example: a Bar object has a nested Foo object (inside hash key nestedFoo).

Foo and Bar definitions:
Main:
Is there a way to 'get' the nestedFoo in shared context? Is there a way to execute task for Foo without export?

edit: I forgot to mention the nested _var in each of the objects. What I mean by 'shared context' is that each _var becomes 2 inside of the share after the task methods have been executed.

by marioroy (Prior) on Jun 18, 2018 at 04:42 UTC

Greetings Veltro,

Yet another interesting use case :)

Q & A

In case objects are shared that do not have STORE and FETCH methods, and they have nested methods and objects, does this mean that the methods cannot be reached and the objects don't get shared? Is there a way to 'get' the nestedFoo in shared context? Is there a way to execute task for Foo without export?

For shared objects, think of them as having an entry point into the shared-manager process. With MCE::Shared, it is important to pass arguments to methods of a shared object instead of dereferencing it. Please note that a method called on a shared object is executed by the shared-manager, where the data resides.

For this use case, embed the shared-data object inside the class. That will allow workers to run in parallel and update the shared data accordingly.

Demo 1: via Perl-like behavior

Shown with a mutex in case multiple workers update the same key. As in the prior post, the reason is that ++ involves two trips to the shared-manager process (FETCH and STORE). I added an export routine to filter out the mutex handle.
Demo 2: using the OO interface This eliminates the mutex at the application level. Here, the export routine calls export on the shared-data object.
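A rough sketch of embedding the shared data inside the class, using the Foo, Bar, nestedFoo, task and _var names from the question. The data key, the var and foo accessors, and the constructor arguments are assumptions made for illustration; this is not the demo from the reply. The objects themselves stay ordinary Perl objects, and only their data lives in the shared-manager process.

```perl
use strict;
use warnings;

use MCE::Hobo;
use MCE::Shared;

package Foo {
    sub new {
        my ( $class, %args ) = @_;
        # 'data' is a hypothetical slot holding the shared hash.
        my $self = { data => MCE::Shared->hash( _var => $args{var} // 1 ) };
        return bless $self, $class;
    }
    # Runs in whichever process calls it; the update lands in the shared hash.
    sub task { $_[0]{data}->incr('_var') }
    sub var  { $_[0]{data}->get('_var') }
}

package Bar {
    sub new {
        my ($class) = @_;
        my $self = {
            data      => MCE::Shared->hash( _var => 1 ),
            nestedFoo => Foo->new,
        };
        return bless $self, $class;
    }
    sub task { $_[0]{data}->incr('_var') }
    sub var  { $_[0]{data}->get('_var') }
    sub foo  { $_[0]{nestedFoo} }
}

package main;

my $bar = Bar->new;

# Each task runs in its own worker, in parallel.
mce_async { $bar->task };
mce_async { $bar->foo->task };

$_->join for MCE::Hobo->list;

my ( $bar_var, $foo_var ) = ( $bar->var, $bar->foo->var );
print "Bar _var: $bar_var, Foo _var: $foo_var\n";
```

Because task is an ordinary method, it executes in the worker that calls it, and only the incr request goes to the shared-manager.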
Output
Regards, Mario

Re: Hobo with a bit of recursion
by marioroy (Prior) on Jun 20, 2018 at 23:00 UTC

Greetings, Veltro,

I came back to revisit this and did a test. There is a bug in MCE::Shared. A shared hash created via the TIE interface, with the module option set to 'MCE::Shared::Hash', should deeply share automatically when key-value pairs are passed during construction. Thank you for posting. I will make a new release, v1.837, with the fix.

Test Script
Output
Changes

In the meantime, here are the changes needed in your script to run properly (3 places). The ref is a 'MCE::Shared::Object'.
Regards, Mario

by Veltro (Hermit) on Jun 21, 2018 at 21:03 UTC

Hello marioroy,

Thanks for the warning. Luckily I didn't run into this problem, since I started building on one of your examples using the shorthand tie my %test, 'MCE::Shared' ;

With the above statement the sharing mechanism actually works very well, so well that it caused me quite some problems because I did something rather dull. In the example below I successfully managed to destroy the original hash $h because I didn't realize the consequence of the nested reference (LOL):
Don't worry too much about whatever I am trying. I am still in the process of understanding the MCE library. For me this is all a big learning exercise for personal development/hobby, and all your help is greatly appreciated.

A couple of other things I ran into during my endeavors:

MCE::Hobo->pending() returns 0 when called inside an async block. Is there another way to get the number of pending threads?

Another thing I was wondering about is whether it is possible to create a hash/object in a worker and transfer control of it to the mother process (without using the sharing mechanism). This would be useful if the object can be created autonomously by a worker process, cutting out the extra overhead needed for sharing, especially if this could be done by passing a reference. (Maybe $mce->gather?)

Thanks,

by marioroy (Prior) on Jun 22, 2018 at 00:34 UTC

Update: Added demonstration using MCE::Inbox

Greetings Veltro,

* I successfully managed to destroy the original hash $h because I didn't realize the consequence of the nested reference...

The nested structure is moved to the shared-manager process, where the shared data resides. Use clone to avoid altering the original hash.
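One way to keep the original intact is sketched below. It assumes Storable's dclone as the cloning step (the reply does not say which clone is meant) and uses made-up keys. Only the deep copy is handed to the tied hash, so whatever the STORE does to the structure it receives, the data behind $h is untouched.

```perl
use strict;
use warnings;

use MCE::Shared;
use Storable qw(dclone);

# Hypothetical data standing in for the original hash $h.
my $h = { counter => 1, nested => { inner => 3 } };

tie my %shared, 'MCE::Shared';

# Store a deep copy; the nested reference inside $copy is the one
# that gets handed over to the shared-manager, not the one in $h.
my $copy = dclone($h);
$shared{$_} = $copy->{$_} for keys %{$copy};

print "original inner: $h->{nested}{inner}\n";
```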
MCE::Hobo supports nested spawning. Inside the async block, pending returns 0 because that worker hasn't spawned any Hobos itself. One way to get the number is via messaging between the workers and the manager process. See the example on GitHub using MCE::Inbox (not yet released on MetaCPAN).
The join method in MCE::Hobo is wantarray-aware. It is useful for obtaining data from the hobo process. The data structure is serialized automatically in memory via MCE::Shared.
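A small sketch of returning a structure through join, with a made-up report hash. The worker builds the data on its own and returns a reference, and the parent receives it in scalar context.

```perl
use strict;
use warnings;

use MCE::Hobo;

# The worker creates the structure autonomously and simply returns it.
my $hobo = mce_async {
    my %report = ( pid => $$, squares => [ map { $_ * $_ } 1 .. 5 ] );
    return \%report;
};

# Scalar context: the single returned reference arrives in the parent.
my $report = $hobo->join;

print "worker $report->{pid} computed @{ $report->{squares} }\n";
```

The structure crosses the process boundary in serialized form, so $report is a copy. A reference into the worker's own memory cannot be handed to the parent as such.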
Another way is the on_finish callback which runs under the mother process. It resembles the on_finish callback in Parallel::ForkManager.
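A sketch of collecting results through on_finish, assuming the callback receives the worker's return values after the pid, exit code, ident, signal and error arguments. The %results hash and the n/square keys are made up for illustration.

```perl
use strict;
use warnings;

use MCE::Hobo;

my %results;    # lives in the parent only; workers never touch it

MCE::Hobo->init(
    max_workers => 2,
    # Called in the parent process as each worker is reaped.
    on_finish => sub {
        my ( $pid, $exit, $ident, $signal, $error, @ret ) = @_;
        my $data = $ret[0];
        $results{ $data->{n} } = $data->{square};
    },
);

for my $n ( 1 .. 4 ) {
    mce_async { return { n => $n, square => $n * $n } };
}

$_->join for MCE::Hobo->list;

print "$_ => $results{$_}\n" for sort { $a <=> $b } keys %results;
```

Since the callback runs in the parent, it can update ordinary non-shared variables directly.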
Regards, Mario