In the spirit of parallelism, MCE provides a generator for a sequence of numbers, which is handy for parallelizing the outer loop. The Sereal module is used if available; otherwise, freeze and thaw are provided by the Storable module. Sereal is roughly 2x faster for large data, so it will likely help once the audacious actions on @queue are in place and their output is possibly saved into @ret.
I'm not sure what is needed once @queue reaches $qsize, so I added @ret as a placeholder for the output.
The non-parallel code (by Karl) takes 8.647 seconds; the parallel code completes in 2.418 seconds.
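The Sereal-or-Storable choice mentioned above can be pictured with a small stand-alone sketch. It only illustrates the fallback idea and is not a copy of what MCE does internally; the names $freeze, $thaw, $name and the sample data are made up for the example.

```perl
use strict;
use warnings;
use Storable ();

# Pick a serializer: Sereal when it is installed, otherwise Storable.
my ($freeze, $thaw, $name);

if ( eval { require Sereal::Encoder; require Sereal::Decoder; 1 } ) {
    my $enc = Sereal::Encoder->new;
    my $dec = Sereal::Decoder->new;
    $freeze = sub { $enc->encode($_[0]) };
    $thaw   = sub { $dec->decode($_[0]) };
    $name   = 'Sereal';
}
else {
    $freeze = \&Storable::freeze;
    $thaw   = \&Storable::thaw;
    $name   = 'Storable';
}

# Hypothetical sample data: a few [x, y] pairs, like the coords in the listing.
my $data = [ map { [ $_, $_ * 2 ] } 0 .. 3 ];

# Round trip: freeze to a string, thaw back into a fresh structure.
my $copy = $thaw->( $freeze->($data) );

print "$name round trip ok\n" if $copy->[3][1] == 6;
```

Either branch returns an equivalent copy of the structure; the difference is only in speed and in which module has to be installed.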
use strict;
use warnings;

use MCE::Flow Sereal => 1;
use Time::HiRes qw(time);

my $start = time;

my @queue;
my @line;

# same ratio; roughly 20 MP
my $width  = 1280 * 4;
my $height = 1024 * 4;
my $qsize  = 32;

# The bounds_only option applies to a sequence of numbers.
# Only the begin and end boundaries of each chunk are computed,
# not the numbers in between. Thus, workers receive 2
# numbers in @{ $chunk_ref }.

MCE::Flow::init(
    max_workers => 'auto',
    chunk_size  => 16,
    bounds_only => 1,
    gather      => sub {
        # Placeholder: whatever a worker passes to MCE->gather
        # arrives here, in the manager process.
        my (@ret) = @_;
    }
);

# Same as mce_flow_s sub { ... }, 0, $width - 1;

MCE::Flow::run_seq( sub {
    my ($mce, $chunk_ref, $chunk_id) = @_;

    for my $x ( $chunk_ref->[0] .. $chunk_ref->[1] ) {
        for my $y ( 0 .. $height - 1 ) {
            my $coords = [ $x, $y ];
            push @line, $coords;

            if ( scalar @line == $width ) {
                push @queue, [@line];
                @line = ();

                # audacious actions start here
                if ( scalar @queue == $qsize ) {
                    my @ret;  # save output to @ret ??
                    # dd \@queue;
                    MCE->gather(@ret);
                    @queue = ();
                }
            }
        }
    }
}, 0, $width - 1 );

printf "Took %.3f seconds\n", time - $start;
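
The gather callback in the listing above is only a placeholder. Here is a minimal sketch of one way it could be filled in, assuming each worker sends back its chunk id together with a result so the manager can reassemble the output in order. The %tmp hash, the $count result, the 0 .. 99 range, and max_workers => 4 are illustrative choices, not part of the original code.

```perl
use strict;
use warnings;
use MCE::Flow;

my %tmp;  # manager-side storage, keyed by chunk id

MCE::Flow::init(
    max_workers => 4,
    chunk_size  => 16,
    bounds_only => 1,
    # Runs in the manager process for every MCE->gather call.
    gather      => sub {
        my ($chunk_id, $count) = @_;
        $tmp{$chunk_id} = $count;
    },
);

MCE::Flow::run_seq( sub {
    my ($mce, $chunk_ref, $chunk_id) = @_;

    # With bounds_only, the chunk holds the begin and end values.
    my ($begin, $end) = ( $chunk_ref->[0], $chunk_ref->[1] );

    # Stand-in for the real work: count the numbers in this chunk.
    my $count = 0;
    $count++ for $begin .. $end;

    # Send the result to the manager, tagged with the chunk id.
    MCE->gather($chunk_id, $count);
}, 0, 99 );

MCE::Flow::finish();

# Chunks may finish in any order; sorting by chunk id restores sequence order.
my $total = 0;
$total += $tmp{$_} for sort { $a <=> $b } keys %tmp;

print "total = $total\n";
```

Tagging each result with $chunk_id keeps the workers free to finish in any order while the manager still sees the data in sequence. For the original problem, the worker would gather whatever the audacious actions produce instead of $count.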