in reply to Re: Getting started with MCE (the Many-Core Engine)
in thread Getting started with MCE (the Many-Core Engine)

Eureka! Adding a billion numbers was taking the usual single perl process 26.8 seconds but MCE spawns 8 perls (1 process per CPU core with get_ncpu), giving the fan a nice workout and doing it in 7.3 seconds! About 3.7x as fast FTW!
time perl -le '$x=1e9;$y=0;$y+=$_ for 1..$x;print$y'
500000000500000000

real 0m26.796s
user 0m26.617s
sys 0m0.088s

time mce (your script)
500000000500000000

real 0m7.323s
user 0m55.095s
sys 0m0.127s
Thank you so much for the example code. I can now see the meaning of chunk_size and how to share data between processes. MCE is a really beautiful distribution with comprehensive documentation and tons of examples (and a couple of cookbooks). I'm just not familiar with the concepts and jargon of parallel processing and was feeling kinda lost, but less now. Thanks to you! :-)

Time for another go at the docs while I figure out how to adapt this to incrementing strings perl-style:

perl -le '$a="a";for(1..30){print$a++}'

Re^3: Getting started with MCE (the Many-Core Engine)
by 1nickt (Canon) on Jun 11, 2018 at 18:14 UTC

    how to adapt this to incrementing strings perl-style

    See a recent example where I showed exactly that, using tie with MCE::Shared (that example shows incrementing the value of keys in a hash; see the MCE::Shared doc for sharing a scalar via tie).

    Alternatively, if you are using MCE::Shared's OO interface it provides sugar methods (shown here using MCE::Hobo for workers):

    use strict;
    use warnings;
    use MCE::Hobo;
    use MCE::Shared;

    # One scalar shared by all workers
    my $shared = MCE::Shared->scalar(0);

    sub task { $shared->incrby(1); }

    MCE::Hobo->init( max_workers => 8, posix_exit => 1 );
    MCE::Hobo->create( \&task, $_ ) for 0 .. 41;
    MCE::Hobo->wait_all;

    END { print "The answer is $shared\n"; }

    Hope this helps!


    The way forward always starts with a minimal test.
      Thanks again but I couldn't get Hobo to do it. I'm just a humble pilgrim in the Holy Land of MCE. Thanks to the seemingly infinite amount of elaborate example code I have tried many things. As you will see below my cargo cult strategy simply replaces numbers with letters to see if it works. MCE::Candy caught my eye for preserving output order because my old method did that, even though it's not necessary. I found the magic in MCE::Candy::out_iter_array:
      #!/usr/bin/perl
      use strict;
      use warnings;
      use MCE;
      use MCE::Candy;

      my $volume      = 26*26;
      my $max_workers = 4;
      my $chunk_size  = int $volume / $max_workers;
      my @results;

      my $mce = MCE->new(
          max_workers => $max_workers,
          chunk_size  => $chunk_size,
          gather      => MCE::Candy::out_iter_array(\@results),
          user_func   => sub {
              my ($mce, $chunk_ref, $chunk_id) = @_;
              my @output;
              foreach my $item (@{ $chunk_ref }) {
                  push @output, $item++;
              }
              $mce->gather($chunk_id, @output);
          }
      );

      $mce->process([ 'aa' .. 'zz' ]);

      print "$_, " for @results;
      print scalar @results, "\n";
      https://github.com/marioroy/mce-perl/blob/master/README.md

        Greetings,

        Some helpful tips for processing a large array.

        Spawn workers early before creating or obtaining a large array to be used as input data. Dividing the work equally by the number of workers is not recommended for large data sets. A chunk_size value of 4000 or 8000 is fine for large arrays. It doesn't take much (chunk_size wise) for IPC to not become the bottleneck. Finally, workers persist after processing (re: $mce->process). Thus, shut down the workers when completed. If you omit the shutdown, it is done for you when the script terminates.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use MCE;
        use MCE::Candy;

        my $volume      = 26*26;
        my $max_workers = 4;
        my $chunk_size  = int $volume / $max_workers / 16;
        my @results;

        my $mce = MCE->new(
            max_workers => $max_workers,
            chunk_size  => $chunk_size,
            gather      => MCE::Candy::out_iter_array(\@results),
            user_func   => sub {
                my ($mce, $chunk_ref, $chunk_id) = @_;
                my @output;
                foreach my $item (@{ $chunk_ref }) {
                    push @output, $item++;
                }
                $mce->gather($chunk_id, @output);
            }
        )->spawn;

        $mce->process([ 'aa' .. 'zz' ]);
        $mce->shutdown;

        print "$_, " for @results;
        print scalar @results, "\n";

        Regards, Mario

Re^3: Getting started with MCE (the Many-Core Engine)
by vr (Curate) on Jun 11, 2018 at 19:45 UTC
    incrementing strings perl-style

    You mean, as in perlop, string

    matches the pattern /^[a-zA-Z]*[0-9]*\z/, the increment is done as a string, preserving each character within its range, with carry
    ? What an unusual and interesting task. What's the context? And string is huge enough for parallelization overhead to pay off? Off the top of my head, split the string into more or less equal substrings (their number equal to number of workers), guaranteed not to carry when incremented required number of times -- or, find indices within original, then feed the data to MCE::Map, then concatenate the output :-)
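For readers who have not met the perlop rule quoted above, here is a minimal core-Perl sketch of the carry behaviour. The helper name `next_string` is made up for illustration; nothing here uses MCE.

```perl
use strict;
use warnings;

# Hypothetical helper: return the string that follows $s under Perl's
# magic string increment. Each character stays within its own range
# (a-z, A-Z, 0-9) and carries to the left like an odometer.
sub next_string {
    my ($s) = @_;
    $s++;
    return $s;
}

# A few starting points that each trigger a carry.
for my $start ('az', 'a9', 'Zz', 'zz') {
    print "$start -> ", next_string($start), "\n";
}
```

The last two cases grow by one character because the carry runs off the left end, which is the situation a carry-free split has to avoid.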
      > You mean, as in perlop

      Yes and I've been re-reading that page for weeks :-)

      What an unusual and interesting task. What's the context? And string is huge enough for parallelization overhead to pay off?

      I'm doing with words what mathematicians do with numbers. It's very similar to searching for prime numbers. These linguistic primes are defined by rules of grammar. So my data is not huge strings but huge sets of billions, and trillions, and hundreds of trillions of uniformly sized strings. You nailed what I have to do by suggesting chunking the data into equal sizes guaranteed not to carry. Perfect for MCE! Thank you.

      This one liner prints all my data, one string at a time:

      perl -e '$a="a";$n=1;$s=time;print"SEC\tWORD\tITER\n";while(){print"\r",time-$s,"\t$a\t$n ";$a++&&$n++}'
      I'll post a root node with more detail once I get it ported over to MCE.
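One way the equal-sized, carry-free chunking discussed above might look, as a rough sketch rather than anything taken from the thread. It assumes fixed-width, lowercase-only strings, and the helpers `string_at` and `chunk_ranges` are invented for this example. The idea is to hand each worker a start string and a count, so the full list never has to be built in memory.

```perl
use strict;
use warnings;

# Hypothetical helper: the $index-th string (0-based) of fixed $width
# in the sequence 'aa..a' .. 'zz..z', i.e. $index written in base 26.
sub string_at {
    my ($index, $width) = @_;
    my @chars;
    for (1 .. $width) {
        unshift @chars, chr(ord('a') + $index % 26);
        $index = int($index / 26);
    }
    return join '', @chars;
}

# Hypothetical helper: split the 26**$width strings into at most $chunks
# near-equal ranges, returned as [first_string, count] pairs.
sub chunk_ranges {
    my ($width, $chunks) = @_;
    my $total = 26 ** $width;
    my $size  = int(($total + $chunks - 1) / $chunks);   # ceiling division
    my @ranges;
    for (my $offset = 0; $offset < $total; $offset += $size) {
        my $left  = $total - $offset;
        my $count = $left < $size ? $left : $size;
        push @ranges, [ string_at($offset, $width), $count ];
    }
    return @ranges;
}

# Show the four ranges for two-letter strings; each range is walked
# locally with the ordinary string increment.
for my $range (chunk_ranges(2, 4)) {
    my ($first, $count) = @$range;
    my $last = $first;
    $last++ for 2 .. $count;
    print "$first .. $last ($count strings)\n";
}
```

Because every range stays inside the fixed width, no increment within a range can change the string length. Note that `26 ** $width` becomes a floating-point value for large widths, so very long strings would need integer-safe arithmetic.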
Re^3: Getting started with MCE (the Many-Core Engine)
by Anonymous Monk on Jun 12, 2018 at 17:18 UTC
    Yes I did notice that 8 perl processes vs 1 "only" sped things up about 3.5x. Tests show no gains over 4 workers here. I guess these i7 processors have 4 physical cores virtualized to 8 marketing thingies. Without MCE my computationally intensive code seemed to be using the resources of 1/2 of 1 of the four real cores (1 virtual core? teenage CPU %s). MCE appears to unlock the missing ~85% of my CPU power. It's like I downloaded 3.5 more of these 4k computers from CPAN for FREE with MCE (and 1 post++ from Perlmonk vr): A $14,000 value! Unreal...
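The numbers behind that observation can be worked out from the `time` output earlier in the thread. This is only a back-of-the-envelope sketch; the helper `ratio` is made up for the example, and the hyperthreading reading of the result is an interpretation, not a measurement.

```perl
use strict;
use warnings;

# Timings copied from the `time` output earlier in the thread (seconds).
my %single = (real => 26.796, user => 26.617);
my %mce    = (real => 7.323,  user => 55.095);

# Hypothetical helper, kept trivial so each figure is easy to check.
sub ratio {
    my ($x, $y) = @_;
    return $x / $y;
}

printf "wall-clock speedup: %.2fx\n", ratio($single{real}, $mce{real});
printf "average busy cores: %.2f\n",  ratio($mce{user},    $mce{real});
printf "CPU time vs single: %.2fx\n", ratio($mce{user},    $single{user});
```

The run kept roughly 7.5 logical cores busy but burned about twice the total CPU time of the single process, which would fit 8 workers sharing 4 physical cores.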