Hello,

Below is another variation. Here, workers write directly to the output handle, similar to testa. The MCE init_relay option, when defined, loads MCE::Relay and enables relay capabilities. Relay is beneficial where workers must run serially and in order. Only one worker at a time can run inside the relay block below, and workers enter it in chunk_id order. In other words, workers wait their turn: the worker with chunk_id 1 goes first, the worker with chunk_id 2 next, and so forth.

I forgot to mention that MCE can spawn threads. Simply add "use threads" at the top of the script, prior to loading MCE. This allows the use of Thread::Queue, Thread::Semaphore, and friends. If curious, compare the memory consumption of testa against this one; I increased $iterations to 1000 to be able to monitor the process in another window. Typically, running without threads is faster on Unix. Either way, the choice is yours if threads are a better fit, e.g. wanting to use Thread::Queue.

use strict;
use warnings;

use MCE;
use Time::HiRes 'time';

my $iterations = 100;
my $chunksize = 50;
my $threads = 5;
my $output = "m.txt";

my %data = ();

foreach ('a'..'z') {
   $data{$_} = $_ x 200;
}

open my $fh, '>', $output or die "open error: $!";

$fh->autoflush(1); test_mce(); close $fh;

sub test_mce {
   my $start = time;

   my $mce = MCE->new(
      max_workers => $threads, chunk_size => $chunksize,
      input_data => input_iter($chunksize, $iterations),
      user_func => \&work,
      init_relay => 0,   # defining this option loads MCE::Relay
   )->run();

   printf STDERR "testa done in %0.02f seconds\n", time - $start;
}

# make an input closure, return iterator

sub input_iter {
   my ($chunk_size, $iterations) = @_;
   my $seq_a = 1;

   return sub {
      return if $seq_a > $iterations;
      my ($chunk_size) = @_;
      my @chunk = ();

      foreach my $seq_b ( 1 .. $chunk_size ) {
         my %retdata = %data;
         $retdata{'.'} = $seq_a * $seq_b;
         push @chunk, \%retdata;
      }

      $seq_a += 1;
      return \@chunk;
   };
}

# MCE task to run in parallel

sub work {
   my ($mce, $chunk_ref, $chunk_id) = @_;
   my $data = $chunk_ref->[0];
   my @ret = ();

   foreach my $chunk (@$data) {
      my %output = ();

      foreach my $key (keys %$chunk) {
         if ($key eq '.') {
            $output{$key} = $$chunk{$key};
            next;
         }
         my $val = $$chunk{$key};
         my $uc = uc($key);
         $val =~ s/$key/$uc/g;
         $output{$key} = $val;
      }

      push(@ret,\%output);
   }

   my $buf = '';

   foreach my $data (@ret) {
      foreach my $key (sort keys %$data) {
         $buf .= $$data{$key};
      }
      $buf .= "\n";
   }

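   # Only one worker at a time runs this block, in chunk_id order,
   # so the chunks reach the file in input order.
   MCE::relay { print {$fh} $buf };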
}
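
For the threads variant mentioned above, here is a minimal sketch of the load order. It is separate from the script above, and the worker body is only for illustration. It assumes a Perl built with thread support, and it has each worker report its thread id through MCE->printf so the lines are written by the manager process.

```perl
use strict;
use warnings;

use threads;   # loaded before MCE so that MCE spawns threads
use MCE;

MCE->new(
   max_workers => 2,
   user_func => sub {
      # Illustrative body only: without input_data, user_func runs once
      # per worker. threads->tid is non-zero inside a spawned thread.
      MCE->printf("worker %d runs in thread %d\n", MCE->wid, threads->tid);
   },
)->run();
```

Removing the "use threads" line should make the workers report thread id 0 instead, since they are then child processes rather than threads.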

Regards, Mario.

In reply to Re^5: shared scalar freed early by marioroy
in thread shared scalar freed early by chris212
