monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Dear Fellow Monks,

I have a task that sums up the results of two functions. It looks like this:
Total Sum = Function_1 + Function_2
Now these two functions can be executed in parallel, meaning that the output of Function_1 doesn't depend on Function_2 or vice versa. The problem is that each function is very time-consuming. That's why I need them to run in parallel.

For simplicity's sake, each function takes a number as the upper bound of a series of values and performs some arithmetic on them. The overall non-parallel, very time-consuming process can be seen in my current code:

#!/usr/bin/perl -w
use strict;

sub Function_1 {
    my $up_bound1 = shift;
    my $total_f1;
    foreach my $in (0..$up_bound1) {
        $total_f1 += $in/3;
    }
    return $total_f1;
}

sub Function_2 {
    my $up_bound2 = shift;
    my $total_f2;
    foreach my $in (0 .. $up_bound2) {
        $total_f2 += $in;
    }
    return $total_f2;
}

## Begin main process
my $up_bound = 10;
my $sum = Function_1($up_bound)+Function_2($up_bound);
print "$sum\n";
Which yields: 73.33333333333333
To speed up the code above, I am trying to use POE to run F1 and F2 in parallel. I would like to pack them inside a subroutine that returns the single final summed value.

This is the first time I am using it, so please kindly bear with me. Here is the sample code I have. I'm totally lost and don't know how to go about it:
#!/usr/bin/perl
use warnings;
use strict;
use POE;

my $up_bound = 10;
my $final_result = parallel_with_poe($up_bound);

sub parallel_with_poe {
    $ubound = shift;
    POE::Session->create(
        inline_states => {
            # How can I pass the parameter correctly here?
            _start => Function_1($_KERNEL,$ubound);
            to_wait => sub { $_[KERNEL]->(delay (tick=>2)); }
            run_function_2 => Function_2($_KERNEL,$ubound);
            # This ain't right also, I don't know
            # how to pass the above result into the function
            _stop => final_sum($_KERNEL,$ubound);
        },
    );
    $poe_kernel->run();
    return; # how to return the final value?
    #exit(0);
}

sub final_sum {
    my ($kernel,$ans_of_f1, $ans_of_f2) = @_;
    my $sum = $ans_of_f1+$ans_of_f2;
    return $sum;
}

sub Function_1 {
    my ($kernel_f1, $up_bound1) = @_;
    my $total_f1;
    foreach my $in ( 0 .. $up_bound1) {
        $total_f1 += $in/3;
    }
    return $total_f1;
}

sub Function_2 {
    my ($kernel_2,$up_bound2) = @_;
    my $total_f2;
    foreach my $in (0 .. $up_bound2) {
        $total_f2 += $in;
    }
    return $total_f2;
}
Thus, I humbly seek enlightenment from my fellow monks.

Regards,
Edward

Replies are listed 'Best First'.
Re: My First POE - Simple Parallel Programming
by nothingmuch (Priest) on Jan 31, 2006 at 13:01 UTC
    POE is an event system that lets you arrange event handlers in a way that suits network apps (small handlers for each event, possibly creating more events that require more handlers), but POE itself is not multithreaded. It looks at the unhandled events, decides which handler to execute, executes it, and repeats the decision once that handler has finished.
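
    To make the "one handler at a time" point concrete, here is a toy dispatcher in plain Perl. It is not POE and does not reflect POE's internals; the event names (start, work, done) and the @queue/%handlers layout are made up purely for illustration. It only shows that in a single-process event loop each handler runs to completion before the next one is picked, so nothing executes simultaneously.

```perl
use strict;
use warnings;

my @queue;    # pending events, each one is [ name, args... ]
my @log;      # order in which handlers actually ran

# Hypothetical handlers: each does a small piece of work and may
# queue further events, much like handlers in an event framework.
my %handlers = (
    start => sub {
        push @log, 'start';
        push @queue, [ 'work', 1 ], [ 'work', 2 ];
    },
    work => sub {
        my $id = shift;
        push @log, "work $id";
        push @queue, [ 'done', $id ];
    },
    done => sub {
        my $id = shift;
        push @log, "done $id";
    },
);

# The loop: take the oldest unhandled event, run its handler to
# completion, then decide again. Only one handler runs at any moment.
sub run_loop {
    while ( my $event = shift @queue ) {
        my ( $name, @args ) = @$event;
        $handlers{$name}->(@args);
    }
}

push @queue, ['start'];
run_loop();
print "$_\n" for @log;
```

    If one of those handlers contained a long calculation, every other event would simply wait behind it.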

    POE is about a certain type of structure.

    You should look into either threads or forking to run things in parallel. Both have limitations.
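
    As a rough sketch of the forking route, assuming a Unix-like system where fork and pipe are available: the child computes Function_1 and writes the result down a pipe while the parent computes Function_2. The helper name parallel_sum is hypothetical, and the result travels as text, so a few trailing digits of precision can be lost.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(_exit);

sub Function_1 {
    my $up_bound = shift;
    my $total    = 0;
    $total += $_ / 3 for 0 .. $up_bound;
    return $total;
}

sub Function_2 {
    my $up_bound = shift;
    my $total    = 0;
    $total += $_ for 0 .. $up_bound;
    return $total;
}

# Hypothetical helper: run Function_1 in a forked child and Function_2
# in the parent, then return the combined sum.
sub parallel_sum {
    my $up_bound = shift;

    pipe( my $reader, my $writer ) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: compute Function_1 and send the result to the parent.
        close $reader;
        print {$writer} Function_1($up_bound), "\n";
        close $writer;
        _exit(0);    # leave without flushing the parent's inherited buffers
    }

    # Parent: compute Function_2 while the child is working.
    close $writer;
    my $f2 = Function_2($up_bound);

    # Collect the child's answer (blocks until the child has written it).
    my $f1 = <$reader>;
    close $reader;
    waitpid( $pid, 0 );
    die "child returned no result" unless defined $f1;
    chomp $f1;

    return $f1 + $f2;
}

print parallel_sum(10), "\n";
```

    This gives the original poster the "one subroutine that returns the summed value" shape, but it only pays off when the machine has more than one CPU and the work is large compared with the cost of forking.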

    -nuffin
    zz zZ Z Z #!perl
Re: My First POE - Simple Parallel Programming
by Ultra (Hermit) on Jan 31, 2006 at 15:09 UTC

    Below you can find a program to get you started (read the comments).

    It uses POE::Wheel::Run, which means that it forks your functions as if they were separate programs, collects their STDOUT, and then sums it up.

    Please note that TIMTOWTDI applies to POE as well ;-) so there are other ways to achieve the same purpose. For example, look into POE::Filter::Reference for passing data between processes.

    use strict;
    use warnings;
    use POE qw(Wheel::Run);

    POE::Session->create (
        inline_states => {
            _start => \&start,
            stdout => \&stdout,
            done   => \&done,
        },
        heap => { sum => 0 }    # here your sum will end up
    );

    POE::Kernel->run();
    exit;

    sub start {
        my ( $kernel, $heap ) = @_[KERNEL, HEAP];

        # If you have more functions with similar interface, just create a loop
        my $function = POE::Wheel::Run->new(
            Program     => sub { Function_1( 10 ) },
            StdoutEvent => 'stdout',
            CloseEvent  => 'done',
        );
        $heap->{function}->{ $function->ID } = $function;

        $function = POE::Wheel::Run->new(
            Program     => sub { Function_2( 10 ) },
            StdoutEvent => 'stdout',
            CloseEvent  => 'done',
        );
        # store the wheel, so that its refcount is incremented
        $heap->{function}->{ $function->ID } = $function;
    }

    sub stdout {
        my ( $heap, $result ) = @_[HEAP, ARG0];
        $heap->{sum} += $result;
    }

    sub done {
        my ( $kernel, $heap, $function_id ) = @_[ KERNEL, HEAP, ARG0 ];

        # delete the reference to the function that has ended, so it may be
        # garbage collected
        delete $heap->{function}->{$function_id};

        # No more children, print the total amount
        # alternately you could send a message back to a parent session with
        # the result
        if ( scalar( keys( %{$heap->{function}} ) ) == 0 ) {
            print "RESULT: ", $heap->{sum}, "\n";
        }
    }

    # The STDOUT of your function is "caught" and returned
    sub Function_1 {
        my $up_bound1 = shift;
        my $total_f1;
        foreach my $in (0..$up_bound1) {
            $total_f1 += $in/3;
        }
        print "$total_f1\n";
        return $total_f1;
    }

    sub Function_2 {
        my $up_bound2 = shift;
        my $total_f2;
        foreach my $in (0 .. $up_bound2) {
            $total_f2 += $in;
        }
        print "$total_f2\n";
        return $total_f2;
    }
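
    On the point about passing data between processes: printing a number to STDOUT works for a single scalar, but structured results need serialising. The sketch below is not POE::Filter::Reference; it uses the core Storable module directly over a plain pipe (as far as I know, Storable is also what that filter uses by default). The helper name run_in_child and the hash layout are assumptions made up for the example.

```perl
use strict;
use warnings;
use POSIX qw(_exit);
use Storable qw(store_fd fd_retrieve);

sub Function_1 {
    my $up_bound = shift;
    my $total    = 0;
    $total += $_ / 3 for 0 .. $up_bound;
    return $total;
}

# Hypothetical helper: run a code reference in a forked child and hand
# back the reference it returns, serialised with Storable over a pipe.
sub run_in_child {
    my ( $code, @args ) = @_;

    pipe( my $reader, my $writer ) or die "pipe failed: $!";

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: the code must return a reference (hash, array, ...).
        close $reader;
        store_fd( $code->(@args), $writer ) or _exit(1);
        close $writer;
        _exit(0);
    }

    # Parent: read the frozen structure back and thaw it.
    close $writer;
    my $result = fd_retrieve($reader);
    close $reader;
    waitpid( $pid, 0 );
    return $result;
}

my $result = run_in_child(
    sub { return { name => 'Function_1', total => Function_1(shift) } },
    10,
);
print "$result->{name}: $result->{total}\n";
```

    Unlike the STDOUT approach, the number crosses the pipe in binary form, so no digits are lost in a text round trip.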

    Dodge This!

      Reading BrowserUk's post Re^3: My First POE - Simple Parallel Programming, I feel that some clarifications are needed regarding my example code. POE does not tie you to one process model:

      • You can use the forked approach.
      • You can use the single-process, select-based model.
      • You can use a multi-machine model, using POE::Component::IKC.
      • Or more

      With this particular simple example, it may seem that the POE solution is bigger (in terms of amount of code written) than the threaded one. Now consider extending the application, so you need to run those functions on different machines. Here POE::Component::IKC comes in handy. I guess the threaded solution would be much bigger (again in terms of amount of code written). But I don't want to start a debate on this.

      Also, the use of POE::Wheel::Run and POE::Filter::Reference makes it a piece of cake to "port" a basic forked application to POE.

      Last but not least, IMO POE makes more sense for those (like me) who find it easier to think in terms of a Finite State Machine.

      p.s.: I deliberately ignored the "speed" issues between the different implementations.

      Dodge This!
Re: My First POE - Simple Parallel Programming
by diego_de_lima (Beadle) on Jan 31, 2006 at 11:58 UTC
    I think you should try using "threads" for this task, not POE.

    As far as I know, POE is much more of a synchronous, event-driven framework. So it's not as easy to use as threads for asynchronous tasks like this.

    Diego de Lima
Re: My First POE - Simple Parallel Programming
by BrowserUk (Patriarch) on Jan 31, 2006 at 17:09 UTC

    You've got a POE solution. Here's what the threads solution looks like:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use threads;

    sub Function_1 {
        my $up_bound1 = shift;
        my $total_f1;
        foreach my $in (0..$up_bound1) {
            $total_f1 += $in/3;
        }
        return $total_f1;
    }

    sub Function_2 {
        my $up_bound2 = shift;
        my $total_f2;
        foreach my $in (0 .. $up_bound2) {
            $total_f2 += $in;
        }
        return $total_f2;
    }

    my $up_bound = 10;
    my $t = threads->new( \&Function_1, $up_bound );
    my $sum = Function_2( $up_bound ) + $t->join;
    print "The total is $sum";

    __END__
    c:\Perl\test>junk6
    The total is 73.3333333333333

    However, unless you have multiple CPUs, that will not run any faster than simply doing the two sequentially. This is a cpu bound process, and unless you have multiple cpus, both calculations will be time-sliced on the same cpu, and that will probably take longer than running them sequentially due to the overhead of task switching.

    That said, the POE solution will never run more quickly, regardless of whether you have multiple CPUs or not. Even if there are multiple CPUs, it will not make use of them. It will simply be time-slicing your single-threaded process, and will run more slowly because of it, and all that extra complexity will have bought you nothing. POE will only benefit you performance-wise if your process spends a lot of time in I/O waits, which it can use to do other processing. POE is very clever, but it isn't designed for CPU-intensive processing.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Well, the POE::Wheel::Run solution will take advantage of multiple CPUs and run faster. Not a single process POE solution of course.

        Okay. I guess I didn't read the POE example closely enough and was assuming a 'normal' POE event driven, single process approach--and it does actually mention forking at the top.

        That said, it would be an interesting exercise to see how much calculation the asynced sub would have to be doing in order to offset the overhead of spawning (two?) new processes and all the IPC etc. before it would show benefit over just calling them serially. I would imagine the breakeven point would be pretty high.
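
        A rough way to look for that breakeven point, assuming a Unix-like system with fork: time the serial call against a version that forks one child per function and reads each result back over a pipe. The helper names serial_sum and forked_sum and the chosen bounds are made up for this sketch, and the timings will differ from machine to machine.

```perl
use strict;
use warnings;
use POSIX qw(_exit);
use Time::HiRes qw(time);

sub Function_1 { my $n = shift; my $t = 0; $t += $_ / 3 for 0 .. $n; return $t }
sub Function_2 { my $n = shift; my $t = 0; $t += $_     for 0 .. $n; return $t }

# Baseline: both functions called one after the other.
sub serial_sum {
    my $n = shift;
    return Function_1($n) + Function_2($n);
}

# Hypothetical forked variant: one child per function, results sent
# back as text over a pipe (%.17g keeps the full double precision).
sub forked_sum {
    my $n = shift;
    my @children;

    for my $func ( \&Function_1, \&Function_2 ) {
        pipe( my $reader, my $writer ) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {
            close $reader;
            printf {$writer} "%.17g\n", $func->($n);
            close $writer;
            _exit(0);
        }
        close $writer;
        push @children, [ $pid, $reader ];
    }

    my $sum = 0;
    for my $child (@children) {
        my ( $pid, $reader ) = @$child;
        my $line = <$reader>;
        close $reader;
        waitpid( $pid, 0 );
        die "child $pid returned no result" unless defined $line;
        chomp $line;
        $sum += $line;
    }
    return $sum;
}

# Compare wall-clock time for a few problem sizes.
for my $n ( 1_000, 100_000, 1_000_000 ) {
    my $t0     = time;
    my $serial = serial_sum($n);
    my $t1     = time;
    my $forked = forked_sum($n);
    my $t2     = time;
    printf "n=%-8d serial=%.4fs forked=%.4fs\n", $n, $t1 - $t0, $t2 - $t1;
}
```

        On a single CPU the forked column should never win; with several CPUs it can only win once the loops are long enough to hide the cost of two forks and two pipes.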

        That said, this application isn't really what POE is designed for. I think POE is very clever--just rather complex. Going by the list of around 400 POE related modules on CPAN, it seems as if every new problem requires a unique solution tailored to that specific problem? Even then, the applications using those modules hardly seem simple, and the use of POE is far from transparent. Threads just seem easier to use to me :)


      "However, unless you have multiple CPUs, that will not run any faster than simply doing the two sequentially."
      Dear BrowserUK,
      Indeed, using threads is much simpler than POE. Thanks for introducing it to me. BTW, does the forking feature of the code above automatically take effect if I run it on a Linux cluster?

      I mean, I have a Linux cluster with 40 CPUs. Can I run your code above with the forking effect just by saving this in a bash script (let's call it my_bash.sh):
      perl junk6.pl
      Then run it the following way with qsub command:
      my_home_node $ qsub my_bash.sh
      It will submit the job to a particular node of the 40 CPUs. Or is there any other way to do it?

      Regards,
      Edward

        No, it won't. Had you mentioned clustering in your OP, I would not have offered a threaded solution.

        That said, a clustering solution based around threads would be far simpler to both implement and use, but I do not have one to offer you, so for your situation POE is the right (only) Perl solution.

