
Hi all, I have a question about using the MCE package. The workflow I have in mind: first I build a hash table by reading in a couple of key-value pair files, ultimately linking fileA's keys to fileC's values via fileB. Next, using this hash table, I want to apply a subroutine to multiple input files and write the results to separate files. I'd like to do this step in parallel. I use R a lot for parallel work, mainly the mclapply function in the parallel library, so I went hunting for parallel packages in Perl and found prefork, MCE and fork manager. I tried implementing the parallel portion with MCE as shown below (the code is mostly taken from https://metacpan.org/pod/MCE::Examples):

#!/usr/bin/env perl

use v5.18;
use strict;
use warnings;
use autodie;
use MCE;

my @input_data  = (0 .. 100 - 1);

## Make an output iterator for gather. Output order is preserved.

sub output_iterator {
   my %tmp; my $order_id = 1;

   return sub {
      my ($result, $chunk_id) = @_;
      $tmp{$chunk_id} = $result;
      while (1) {
         last unless (exists $tmp{$order_id});

         open my $output, '>', "/path/to/my/files/$chunk_id.txt";
         foreach (1..10) { print $output "\t",fibonacci($_)};
         say $output;
         close $output;

         delete $tmp{$order_id++};
      }
   };
}

## Use $chunk_ref->[0] or $_ to retrieve the element.
my $mce = MCE->new(
   chunk_size => 1,     #setting to 1 = do not chunk
   max_workers => 8,    #number of CPU cores
   gather => output_iterator(), #the function which will be applied to each element of the array
   );

MCE->foreach( \@input_data, sub {
   my ($mce, $chunk_ref, $chunk_id) = @_;
   my $result = sqrt($chunk_ref->[0]);
   MCE->gather($result, $chunk_id);
});

sub fibonacci {
    my $n = shift;
    return undef if $n < 0;
    my $f;
    if ($n == 0) {
        $f = 0;
    } elsif ($n == 1) {
        $f = 1;
    } else {
        $f = fibonacci($n-1) + fibonacci($n-2);
    }
    return $f;
}

However, I noticed that the number of output files is not consistent with the size of my input, i.e. my array @input_data. I would get 96 output files, then 97, and finally 100 if I rerun the program without deleting the output files. What's wrong?
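
One possible cause, sketched here without MCE so that it runs with core Perl alone: the gather callback flushes every buffered result inside its while loop, but it names the file after $chunk_id (the chunk that just arrived) rather than $order_id (the chunk being flushed). If chunks arrive out of order, several flushed results would then land in the same file and some file names would never be created. The simulation below assumes that reading of the code. make_iterator and the arrival order are hypothetical, and file writes are replaced by a counter.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical stand-in for the gather iterator: instead of opening files,
# it records which file name each flushed result would be written to.
sub make_iterator {
    my ($name_by_order, $written) = @_;
    my %tmp;
    my $order_id = 1;

    return sub {
        my ($result, $chunk_id) = @_;
        $tmp{$chunk_id} = $result;
        while (exists $tmp{$order_id}) {
            # The posted code names the file after $chunk_id; the variant
            # names it after $order_id, the chunk actually being flushed.
            my $name = $name_by_order ? $order_id : $chunk_id;
            $written->{"$name.txt"}++;
            delete $tmp{$order_id++};
        }
    };
}

# Assumed arrival order: chunks 3 and 2 finish before chunk 1.
my @arrivals = (3, 2, 1, 4);

my (%as_posted, %by_order);
my $posted = make_iterator(0, \%as_posted);
my $fixed  = make_iterator(1, \%by_order);

for my $chunk_id (@arrivals) {
    $posted->(sqrt($chunk_id), $chunk_id);
    $fixed->(sqrt($chunk_id), $chunk_id);
}

printf "as posted: %d files (%s)\n",
    scalar(keys %as_posted), join(', ', sort keys %as_posted);
printf "by order:  %d files (%s)\n",
    scalar(keys %by_order), join(', ', sort keys %by_order);
```

With this arrival order the posted naming yields two distinct files (1.txt and 4.txt) while naming by $order_id yields four. Since worker timing differs between runs, rerunning without deleting earlier output would gradually fill in the missing names, which matches the 96, 97, 100 pattern described above.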

In reply to Using MCE to write to multiple files. by etheleon
