karlgoethebier has asked for the wisdom of the Perl Monks concerning the following question:
Hi all,
I try to create a queue for multithreaded processing of a huge image file.
Here is what i figured out:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dump;
use Time::HiRes qw(time);
# use feature qw(say);
my $start = time;
my @queue;
my @line;
# ratio 1.25 - i guess 160 pixel
# my $width = 20;
# my $height = 8;
# my $qsize = 4;
# same ratio - i guess 20 MP
my $width = 1280 * 4;
my $height = 1024 * 4;
my $qsize = 32;
for my $x ( 0 .. $width - 1 ) {
for my $y ( 0 .. $height - 1 ) {
my $coords = [ $x, $y ];
push @line, $coords;
if ( scalar @line == $width ) {
push @queue, [@line];
@line = ();
# audacious actions start here
if ( scalar @queue == $qsize ) {
# dd \@queue;
@queue = ();
}
}
}
}
printf "Took %.3f seconds\n", time - $start;
__END__
karls-mac-mini:monks karl$ ./queue.pl
Took 13.531 seconds
It looks like it works as designed.
But i can't suppress the vague feeling that this solution is ungeschickt.
How can i improve this code?
Update: Thank you very much to all that contributed to this interesting thread. I have little time this week, but when i have studied the numerous examples given, i'll reply more detailed respectively ask more questions :-)
Thank you very much for any hint and best regards,
Karl
«The Crux of the Biscuit is the Apostrophe»
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by FreeBeerReekingMonk (Deacon) on May 29, 2015 at 20:13 UTC
|
You are queueing pixel by pixel. But the threads can calculate (x,y) by themselves. Why not do the following:
my $width = 1280 * 4;
my $height = 1024 * 4;
$picturesize = $width * $height;
for(my $i=0; $i< $picturesize-1; $i+=32){
$start = $i;
$end = $i+32;
$x1 = int(($start % $width) /4);
$y1 = int(($start / $width) /4);
$x2 = int(($end % $width) /4);
$y2 = int(($end / $width) /4);
printf("(%4d,%4d)..(%4d,%4d) %d\n",
$x1,$y1,$x2,$y2, $i);
}
This way, the main program only uses the picture size (a large integer), and increases it by 32 units (4 pixels with four byte RGBA perl pixel, I guess). And you give it a start and stop range: $i .. $i+32. The thread can then use those numbers to calculate back the pixels (x,y) themselves, in parallel.
startthread($from,$to,$width)
| [reply] [d/l] |
|
( 0, 0)..( 8, 0) 0
( 8, 0)..( 16, 0) 32
( 16, 0)..( 24, 0) 64
( 24, 0)..( 32, 0) 96
( 32, 0)..( 40, 0) 128
...
Ich stehe gerade auf dem Schlauch.
Best regards, Karl
«The Crux of the Biscuit is the Apostrophe»
| [reply] [d/l] |
|
Instead of enqueueing pixels, you enqueue an offset, and a length of 32 (and the thread will also need the width of the image). With that data, each thread is able to calculate itself its tasked pixels.
Thus, if I have offset=0 the formula give calculates
(x=0,y=0) to (x=8,y=0).
If I have offset=32 the formula give calculates
(x=8,y=0) to (x=16,y=0).
Those are 8 pixels of 4 bytes each (RGBA).
you can avoid that thread-save problem with imagemagick. You read an image and $blob=$original->ImageToBlob(); then manipulate all pixels in the blob, and BlobToImage() back.
See this imagemagick post.
And also this Permonk post: Re: reading in raw data into perl's ImageMagick
At least these primitives are ok with threads? (although not your eyes, you see I used seizure inducing red and green lines... sorry for that)
#! perl
use warnings;
use strict;
use threads;
use Image::Magick;
my $image = Image::Magick->new(
size => "600x600",
);
$image->Read("xc:white");
for my $i (200..400){
async{
my $color = $i % 2 ? '#f00' : '#0f0';
$image->Draw(
primitive => 'line',
points => "$i,100 $i,500",
stroke => $color,
);
}->join;
}
$image->Set(magick=>'gif');
my $blob = $image->ImageToBlob();
open(FH,"> $0.gif")or die "$!\n";
print FH $blob;
close FH;
perldoc threads | [reply] [d/l] [select] |
|
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by BrowserUk (Patriarch) on May 30, 2015 at 15:18 UTC
|
Other than the title; there doesn't seem to be any mention of "threads" in your code.
And the thing you've named "queue" is just a bog standard perl array that you push things into and then empty.
And the second part of your title doesn't seem to have any relevance to the code you posted.
You ask: How can i improve this code?
But as I can't work out what it does (by inspection); and you don't post any results; and don't say what you think might be wrong with it; its hard to begin to work out what you are looking for?
Basically, I cannot make any connection between this post and the earlier 2 in the series?
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
| [reply] |
|
"I cannot make any connection between this post and the earlier 2 in the series"
Sorry BrowserUK for being imprecise, i'll try to give some more information:
Short time ago i wrote a script for learning purposes and fun that renders a huge image (20 MP) in ~13m on my box
I'd like to write a multithreaded version of this script.
The basic idea is to do it "row by row".
Enqueue some rows of the image and process them...
* * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * *
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
...next step...
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
* * * * * * * * * * * * * * * * * * * *
* * * * * * * * * * * * * * * * * * * *
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x x x x x
...proceed until finished.
The snippet in my OP calculates the coordinates for the rows.
For $width = 4; $height = 4; $qsize = 4; the output is:
[
[[0, 0], [0, 1], [0, 2], [0, 3]],
[[1, 0], [1, 1], [1, 2], [1, 3]],
[[2, 0], [2, 1], [2, 2], [2, 3]],
[[3, 0], [3, 1], [3, 2], [3, 3]],
]
@queue should perhaps better been named @rows.
Then:
my $queue = Thread::Queue->new();
$queue->enqueue(\@rows);
my @threads = map { threads->create( \&process, $queue ); } 1 .. 4;
...
Somewhere in the code:
if ( $div == 0 ) {
$image->setpixel( x => $x, y => $y, color => $black );
}
else {
$image->setpixel( x => $x, y => $y, color => $palette[$color] );
}
OK, busted. It's a Mandelbrot. My first try.
I hope very much that i could clarify the connection of my OP to the series.
Best regards, Karl
«The Crux of the Biscuit is the Apostrophe»
| [reply] [d/l] [select] |
|
Sorry BrowserUK for being imprecise,
It's not so much a matter of imprecision as omission; but no matter. To the problem.
You haven't show what where $image in $image->setpixel() comes from. Its not GD (which has setPixel()) so I'm assuming it comes from Imager or Image::Magick or a similar package.
The first question is: are the C/C++ libraries that underly that module thread-safe.
To explain. If I run this short script that uses GD, it displays a 100x100 png with the left half red and the right half blue:
#! perl -slw
use strict;
use threads;
use GD;
sub rgb2n{ local $^W; unpack 'N', pack 'CCCC', 0, @_ }
my $i = GD::Image->new( 100, 100, 1 );
# async{
$i->filledRectangle( 0,0, 50, 100, rgb2n( 255, 0, 0 ) );
# }->join;
$i->filledRectangle( 51,0, 100, 100, rgb2n( 0,0, 255 ) );
open O, '>:raw', 'junk.png' or die $!;
print O $i->png;
close O;
system 'junk.png';
However, If I uncomment the two commented lines, thus the two rectangles are drawn to the image by different threads, I get this when I run it: C:\test>junk999
gd-png: fatal libpng error: No IDATs written into file
gd-png error: setjmp returns error condition
Ie. The libgd that underlies GD isn't threadsafe; and whilst the drawing to the image in the two threads executes without errors; when it comes to writing the image out to a file, a fatal error occurs. This isn't a limitation of perl's threading, but the underlying library.
But you're using a different graphics library
Let's explore if it makes any sense to write and image from multiple concurrent threads.
Let's assume that one thread draws horizontal lines across the image in red, whilst another thread draws vertical bars down the image in blue. What should the resultant image look like?
Like this? (You'll have to use your imagination here (#is red * is blue): ** ** ** ** ** **
##**###**###**###**###**###**#
** ** ** ** ** **
##**###**###**###**###**###**#
** ** ** ** ** **
##**###**###**###**###**###**#
** ** ** ** ** **
##**###**###**###**###**###**#
** ** ** ** ** **
Or like this?: ** ** ** ** ** **
##############################
** ** ** ** ** **
##############################
** ** ** ** ** **
##############################
** ** ** ** ** **
##############################
** ** ** ** ** **
Or some variation on this?: ** ** ** ** ** **
#######**###**###**###**###**#
** ** ** ** ** **
############**###**###**###**#
** ** ** ** ** **
#################**###**###**#
** ** ** ** ** **
######################**###**#
** ** ** ** ** **
Now you're probably thinking that as you're only using setPixel, and never writing to the same pixel twice, it doesn't matter what order things occur; but most graphics libraries retain a whole bunch of state between drawing calls -- eg. current clip region; current line pattern; current alpha setting; current .... -- and if two threads start modifying that internal state without using locking, then the internals of the image; and even the entire library can become corrupted.
That's what happened with my GD example.
Enough for one reply, more in the next.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
| [reply] [d/l] [select] |
|
c:\test>junk999
Filling a 1000x1000 pixel image 1-pixel at a time took: 6.491771936s
Populating 1000x1000 AoAoAs and tranferring to another thread took: 30
+.876082897s
It takes five times longer to build the AoAoAs and pass it to another thread for drawing, than it does to do that drawing in the main thread!
But what is the purpose of generating the coordinates and wrapping them up in the AoAoAs, only for the other thread to have to use nested loops to access them, when you could just pass 1000x1000 and have the other thread do the iteration itself.
And what is the point of passing the entire data structure as a single entity? Only one thread will be able to receive it; so there's not even the possibility of using concurrency.
Perhaps your intention is to do: $Q->enqueue( @cols ); (Ie. enqueue the contents of the array, not a reference to it.)
That way, the row arrays would be pushed onto the queue as separate entities; which means that different threads could dequeue individual rows and operate upon them concurrently. Assuming a thread-safe graphics library.
But even then what is the point in generating zillions of iddy-biddy arrays containing pairs of coordinates; when you could (say) push just the information required to construct them in the target thread: # main thread
my( $X, $Y ) = ...;
for my $y ( 0 .. $Y-1 ) {
$Q->enqueue( [ $X, $y ] );
}
## drawing thread
while( my( $x, $y ) = @{ $Q->dequeue } ) {
$image->setpixel( $_, $y, ... ) for 0 .. $X-1;
}
You need the same loop code in the thread as you would to unpack the AoAs; but you only need to transfer a 1000th or less of the information between threads.
Of course, the Devil is in the detail. Those ... above represent the color; and I assume that although you haven't shown it, the intent is to calculate the color for your Mandelbrot pixels in the main thread and farm off the drawing to the threads. And your (unshown) intention is to actually enqueue [ x, y, color ], [ x, y, color ] .... But that still doesn't make sense.
Why transfer the same y-coordinate a thousand times (or however wide your image is) for each row?
And why transfer a thousand X-coordinates when they can be inferred.
Ie. Do this: ## main
for my $y ( 0 .. $Y ) {
my @colors;
for my $x ( 0 .. $X ) {
push @colors, Mandelbrot( $x, $y );
}
$Q->enqueue( [ $Y, @colors ] );
}
### thread
while( my( $y, @colors ) = @{ $Q->dequeue } ) {
$image->setpixel( $_, $y, $colors[ $_ ] ) for 0 .. $#colors;
}
The Y-coordinate once, and a list of colors for the pixels is transfered, and the X-coordinates are inferred. (The problem of thread-safety remains.)
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
| [reply] [d/l] [select] |
|
|
|
|
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 01, 2015 at 00:32 UTC
|
In the spirit of parallelism, MCE provides a generator for sequence of numbers which is beneficial for parallelizing the outer loop. The Sereal module is requested if available. Otherwise, freeze and thaw are provided by the Storable module. The Sereal module is ~ 2x faster for large data and likely helpful after completing audacious actions with @queue possibly saved into @ret.
I'm not sure what is needed once the @queue reaches $qsize. Thus, added @ret.
The non-parallel code (by Karl) takes 8.647 seconds with the parallel code completing in 2.418 seconds.
use MCE::Flow Sereal => 1;
use Time::HiRes qw(time);
my $start = time;
my @queue;
my @line;
# same ratio - i guess 20 MP
my $width = 1280 * 4;
my $height = 1024 * 4;
my $qsize = 32;
# The bounds_only option applies to sequence of numbers
# which means to compute the begin and end boundaries only,
# not the numbers in between. Thus, workers receive 2
# numbers in @{ $chunk_ref }.
MCE::Flow::init(
max_workers => 'auto',
chunk_size => 16,
bounds_only => 1,
gather => sub {
my (@ret) = @_;
}
);
# Same as mce_flow_s sub { ... }, 0, $width - 1;
MCE::Flow::run_seq( sub {
my ($mce, $chunk_ref, $chunk_id) = @_;
for my $x ( $chunk_ref->[0] .. $chunk_ref->[1] ) {
for my $y ( 0 .. $height - 1 ) {
my $coords = [ $x, $y ];
push @line, $coords;
if ( scalar @line == $width ) {
push @queue, [@line];
@line = ();
# audacious actions start here
if ( scalar @queue == $qsize ) {
my @ret; # save output to @ret ??
# dd \@queue;
MCE->gather(@ret);
@queue = ();
}
}
}
}
}, 0, $width - 1 );
printf "Took %.3f seconds\n", time - $start;
| [reply] [d/l] |
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 01, 2015 at 14:32 UTC
|
Updated: Added duration times for 4 and 2 workers.
Recap for the time taken for the Mandelbrot examples. Testing was done on a MBP (Haswell Core i7 quad-core at 2.6 GHz with HT) running Perl v5.16.2.
Serial code: 21.327 secs (HoH)
MCE gather: 5.795 secs (HoH) Sadly, MCE::Shared is not able to reach this
thread::shared: 19.157 secs (AoA) Locking is not necessary for this demonstration
thread::shared: 9.730 secs (AoA) However, runs faster with added lock statements
MCE::Shared: 8.743 secs (AoA) 8 workers ; made possible from parallel IPC
MCE::Shared: 11.048 secs (AoA) 4 workers ; supporting processes and threads
MCE::Shared: 18.488 secs (AoA) 2 workers
MCE::Shared is not yet available on CPAN and will be released soon along with MCE 1.7. | [reply] |
|
| [reply] |
|
Update: See this post for a faster version. A MCE + Inline::C demonstration is also included.
Parallelism is possible with just one line of code for the next demonstration. As in the previous examples, am using the sequence generator with chunk_size set to 1. Thus, $_ is the next sequence number when specifying chunk_size => 1.
mce_flow_s sub { draw_line($_) }, 0, $h - 1;
Folks will be pleased to know that this will work for non-threads in the upcoming MCE 1.7 release with MCE::Shared. One simply replaces ( use threads; use threads::shared ) with ( use MCE::Shared ) and ( my @picture : shared; ) with ( my @picture : Shared; ). That's all.
The code outputs the image in pbm format.
# synopsis: perl mandelbrot.pl 1024 > image.pbm
#
# https://github.com/nbraud/benchmarksgame/blob/master/bench/
# mandelbrot/mandelbrot.perl
#
# The Computer Language Benchmarks Game
# http://benchmarksgame.alioth.debian.org/
#
# contributed by Mykola Zubach
# shorten by Mario Roy from using MCE
use strict;
use threads;
use threads::shared;
use MCE::Flow chunk_size => 1, max_workers => 'auto';
use constant MAXITER => 50;
use constant LIMIT => 4.0;
use constant XMIN => -1.5;
use constant YMIN => -1;
use constant WHITE => "\0";
use constant BLACK => "\001";
my ($w, $h, $invN);
my @picture : shared;
sub draw_line($) {
my $y = shift;
my $line;
my $Ci = $y * $invN + YMIN;
X:
for my $x (0 .. $w - 1) {
my ($Zr, $Zi, $Tr, $Ti);
my $Cr = $x * $invN + XMIN;
for (1 .. MAXITER) {
$Zi = $Zi * 2 * $Zr + $Ci;
$Zr = $Tr - $Ti + $Cr;
$Ti = $Zi * $Zi;
$Tr = $Zr * $Zr;
if ($Tr + $Ti > LIMIT) {
$line .= WHITE;
next X;
}
}
$line .= BLACK;
}
$picture[$y] = pack 'B*', $line;
}
## MAIN()
$w = $h = shift || 200;
$invN = 2 / $w;
mce_flow_s sub { draw_line($_) }, 0, $h - 1;
binmode STDOUT;
print "P4\n$w $h\n"; # PBM image header
print @picture;
| [reply] [d/l] [select] |
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 01, 2015 at 00:46 UTC
|
Karl later mentioned a Mandelbrot. I searched the web for a parallel example and converted the code to Perl using MCE. Like the previous example, am using the sequence generator for parallelizing the outer loop.
It takes 21.327 seconds to run serially (commented out) and 5.795 seconds with MCE from a Core i7 at 2.6 GHz (Haswell).
use strict;
use warnings;
use MCE::Flow Sereal => 1;
my ($m, $n, $count_max) = (500, 500, 2000);
my ($x_max, $x_min, $y_max, $y_min) = (
1.25, -2.25, 1.75, -1.75
);
#--------------------------------------------------------------------#
# Perl version of Mandelbrot_OpenMP by John Burkardt.
# http://people.sc.fsu.edu/~jburkardt/c_src/mandelbrot_openmp/
# mandelbrot_openmp.html
#
# Carry out the iteration for each pixel, determining count.
sub mandelbrot
{
my ($r, $g, $b, $m1, $m2) = @_;
my ($x, $x1, $x2, $y, $y1, $y2);
my ($c, %count, $i, $j, $k);
for $i ( $m1 .. $m2 ) {
for $j ( 0 .. $n - 1 ) {
$x = ( ( $j - 1 ) * $x_max
+ ( $m - $j ) * $x_min )
/ ( $m - 1 );
$y = ( ( $i - 1 ) * $y_max
+ ( $n - $i ) * $y_min )
/ ( $n - 1 );
$count{$i}{$j} = 0;
$x1 = $x;
$y1 = $y;
for $k ( 1 .. $count_max ) {
$x2 = ( $x1 * $x1 ) - ( $y1 * $y1 ) + $x;
$y2 = 2 * $x1 * $y1 + $y;
if ( $x2 < -2.0 || 2.0 < $x2 || $y2 < -2.0 || 2.0 < $y2 )
{
$count{$i}{$j} = $k;
last;
}
$x1 = $x2;
$y1 = $y2;
}
if ( ( $count{$i}{$j} % 2 ) == 1 ) {
$r->{$i}{$j} = 255;
$g->{$i}{$j} = 255;
$b->{$i}{$j} = 255;
}
else {
$c = int(
255.0 * sqrt(sqrt(sqrt($count{$i}{$j} / $count_max)))
);
$r->{$i}{$j} = $g->{$i}{$j} = 3 * $c / 5;
$b->{$i}{$j} = $c;
}
}
}
return;
}
#--------------------------------------------------------------------#
# Return the smaller of two numbers.
sub min
{
$_[0] < $_[1] ? $_[0] : $_[1];
}
# Write image to an ASCII PPM file.
sub write_to_ppm
{
my ($r, $g, $b, $ofile) = @_;
my ($FH, $i, $j, $jlo, $jhi);
open $FH, '>', $ofile or die "cannot open file ($ofile): $!\n";
print $FH "P3\n";
print $FH "$n $m\n";
print $FH "255\n";
for ( $i = 0; $i < $m; $i++ ) {
for ( $jlo = 0; $jlo < $n; $jlo += 4 ) {
$jhi = min( $jlo + 4, $n );
for ( $j = $jlo; $j < $jhi; $j++ ) {
printf $FH " %d %d %d",
$r->{$i}{$j}, $g->{$i}{$j}, $b->{$i}{$j}
}
print $FH "\n";
}
}
close $FH;
}
#--------------------------------------------------------------------#
my ($r, $g, $b) = ({}, {}, {});
# Running serially.
# mandelbrot($r, $g, $b, 0, $m - 1);
# Run parallel.
#
# The bounds_only option applies to sequence of numbers
# which means to compute the begin and end boundaries only,
# not the numbers in between. Thus, workers receive 2
# numbers in @{ $chunk_ref }.
MCE::Flow::init(
chunk_size => 5, max_workers => 'auto', bounds_only => 1,
gather => sub {
my ($rr, $gg, $bb, $m1, $m2) = @_;
for my $i ($m1 .. $m2) {
for my $j (0 .. $n - 1) {
$r->{$i}{$j} = $rr->{$i}{$j};
$g->{$i}{$j} = $gg->{$i}{$j};
$b->{$i}{$j} = $bb->{$i}{$j};
}
}
}
);
# Same as mce_flow_s sub { ... }, 0, $m - 1;
MCE::Flow::run_seq( sub {
my ($mce, $chunk_ref, $chunk_id) = @_;
my ($rr, $gg, $bb, $m1, $m2) = ({}, {}, {}, @{ $chunk_ref });
mandelbrot ($rr, $gg, $bb, $m1, $m2);
MCE->gather($rr, $gg, $bb, $m1, $m2);
}, 0, $m - 1 );
# Shutdown MCE workers.
MCE::Flow::finish;
# Write image to an ASCII PPM file.
write_to_ppm($r, $g, $b, 'mandelbrot.ppm');
| [reply] [d/l] |
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 01, 2015 at 13:56 UTC
|
Below, a version using shared variables ($r, $g, $b) provided by threads::shared. This completes in 9.730 seconds.
use strict;
use warnings;
use threads;
use threads::shared;
use MCE::Flow;
my ($m, $n, $count_max) = (500, 500, 2000);
my ($x_max, $x_min, $y_max, $y_min) = (
1.25, -2.25, 1.75, -1.75
);
my $r = shared_clone [];
my $g = shared_clone [];
my $b = shared_clone [];
for my $i ( 0 .. $m - 1 ) {
$r->[$i] = shared_clone [];
$g->[$i] = shared_clone [];
$b->[$i] = shared_clone [];
}
#--------------------------------------------------------------------#
# Perl version of Mandelbrot_OpenMP by John Burkardt.
# http://people.sc.fsu.edu/~jburkardt/c_src/mandelbrot_openmp/
# mandelbrot_openmp.html
#
# Carry out the iteration for each pixel, determining count.
sub mandelbrot
{
my ($m1, $m2) = @_;
my ($x, $x1, $x2, $y, $y1, $y2);
my ($c, %count, $i, $j, $k);
for $i ( $m1 .. $m2 ) {
for $j ( 0 .. $n - 1 ) {
$x = ( ( $j - 1 ) * $x_max
+ ( $m - $j ) * $x_min )
/ ( $m - 1 );
$y = ( ( $i - 1 ) * $y_max
+ ( $n - $i ) * $y_min )
/ ( $n - 1 );
$count{$i}{$j} = 0;
$x1 = $x;
$y1 = $y;
for $k ( 1 .. $count_max ) {
$x2 = ( $x1 * $x1 ) - ( $y1 * $y1 ) + $x;
$y2 = 2 * $x1 * $y1 + $y;
if ( $x2 < -2.0 || 2.0 < $x2 || $y2 < -2.0 || 2.0 < $y2 )
{
$count{$i}{$j} = $k;
last;
}
$x1 = $x2;
$y1 = $y2;
}
if ( ( $count{$i}{$j} % 2 ) == 1 ) {
lock $r; # not necessary, but runs faster with locking
$r->[$i][$j] = 255;
$g->[$i][$j] = 255;
$b->[$i][$j] = 255;
}
else {
lock $r; # ditto
$c = int(
255.0 * sqrt(sqrt(sqrt($count{$i}{$j} / $count_max)))
);
$r->[$i][$j] = $g->[$i][$j] = 3 * $c / 5;
$b->[$i][$j] = $c;
}
}
}
return;
}
#--------------------------------------------------------------------#
# Return the smaller of two numbers.
sub min
{
$_[0] < $_[1] ? $_[0] : $_[1];
}
# Write image to an ASCII PPM file.
sub write_to_ppm
{
my ($ofile) = @_;
my ($FH, $i, $j, $jlo, $jhi);
open $FH, '>', $ofile or die "cannot open file ($ofile): $!\n";
print $FH "P3\n";
print $FH "$n $m\n";
print $FH "255\n";
for ( $i = 0; $i < $m; $i++ ) {
for ( $jlo = 0; $jlo < $n; $jlo += 4 ) {
$jhi = min( $jlo + 4, $n );
for ( $j = $jlo; $j < $jhi; $j++ ) {
printf $FH " %d %d %d",
$r->[$i][$j], $g->[$i][$j], $b->[$i][$j]
}
print $FH "\n";
}
}
close $FH;
}
#--------------------------------------------------------------------#
# The bounds_only option applies to sequence of numbers
# which means to compute the begin and end boundaries only,
# not the numbers in between. Thus, workers receive 2
# numbers in @{ $chunk_ref }.
# Same as mce_flow_s { mce options }, sub { ... }, 0, $m - 1;
MCE::Flow::run_seq( {
chunk_size => 5, max_workers => 'auto', bounds_only => 1
}, sub {
my ($mce, $chunk_ref, $chunk_id) = @_;
mandelbrot( @{ $chunk_ref} );
}, 0, $m - 1 );
## Write image to an ASCII PPM file.
write_to_ppm('mandelbrot.ppm');
| [reply] [d/l] |
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 01, 2015 at 14:05 UTC
|
Finally, a version using shared variables ($r, $g, $b) provided by MCE::Shared to be released with MCE 1.7 supporting processes and threads. This completes in 8.743 seconds.
The fast option to MCE::Shared applies to deep data structures and okay to enable for this demonstration from not having circular references. Deep data sharing is fully automatic. Thus, one is not having to run shared_clone as in the previous post.
use strict;
use warnings;
use MCE::Flow;
use MCE::Shared fast => 1;
my $r = mce_share [];
my $g = mce_share [];
my $b = mce_share [];
my ($m, $n, $count_max) = (500, 500, 2000);
my ($x_max, $x_min, $y_max, $y_min) = (
1.25, -2.25, 1.75, -1.75
);
#--------------------------------------------------------------------#
# Perl version of Mandelbrot_OpenMP by John Burkardt.
# http://people.sc.fsu.edu/~jburkardt/c_src/mandelbrot_openmp/
# mandelbrot_openmp.html
#
# Carry out the iteration for each pixel, determining count.
sub mandelbrot
{
my ($m1, $m2) = @_;
my ($x, $x1, $x2, $y, $y1, $y2);
my ($c, %count, $i, $j, $k);
for $i ( $m1 .. $m2 ) {
for $j ( 0 .. $n - 1 ) {
$x = ( ( $j - 1 ) * $x_max
+ ( $m - $j ) * $x_min )
/ ( $m - 1 );
$y = ( ( $i - 1 ) * $y_max
+ ( $n - $i ) * $y_min )
/ ( $n - 1 );
$count{$i}{$j} = 0;
$x1 = $x;
$y1 = $y;
for $k ( 1 .. $count_max ) {
$x2 = ( $x1 * $x1 ) - ( $y1 * $y1 ) + $x;
$y2 = 2 * $x1 * $y1 + $y;
if ( $x2 < -2.0 || 2.0 < $x2 || $y2 < -2.0 || 2.0 < $y2 )
{
$count{$i}{$j} = $k;
last;
}
$x1 = $x2;
$y1 = $y2;
}
if ( ( $count{$i}{$j} % 2 ) == 1 ) {
$r->[$i][$j] = 255;
$g->[$i][$j] = 255;
$b->[$i][$j] = 255;
}
else {
$c = int(
255.0 * sqrt(sqrt(sqrt($count{$i}{$j} / $count_max)))
);
$r->[$i][$j] = $g->[$i][$j] = 3 * $c / 5;
$b->[$i][$j] = $c;
}
}
}
return;
}
#--------------------------------------------------------------------#
# Return the smaller of two numbers.
sub min
{
$_[0] < $_[1] ? $_[0] : $_[1];
}
# Write image to an ASCII PPM file.
sub write_to_ppm
{
my ($ofile) = @_;
my ($FH, $i, $j, $jlo, $jhi);
my $re = $r->export; # work is completed, okay to export
my $ge = $g->export;
my $be = $b->export;
open $FH, '>', $ofile or die "cannot open file ($ofile): $!\n";
print $FH "P3\n";
print $FH "$n $m\n";
print $FH "255\n";
for ( $i = 0; $i < $m; $i++ ) {
for ( $jlo = 0; $jlo < $n; $jlo += 4 ) {
$jhi = min( $jlo + 4, $n );
for ( $j = $jlo; $j < $jhi; $j++ ) {
printf $FH " %d %d %d",
$re->[$i][$j], $ge->[$i][$j], $be->[$i][$j]
}
print $FH "\n";
}
}
close $FH;
}
#--------------------------------------------------------------------#
# The bounds_only option applies to sequence of numbers
# which means to compute the begin and end boundaries only,
# not the numbers in between. Thus, workers receive 2
# numbers in @{ $chunk_ref }.
# Same as mce_flow_s { mce options }, sub { ... }, 0, $m - 1;
MCE::Flow::run_seq( {
chunk_size => 5, max_workers => 'auto', bounds_only => 1
}, sub {
my ($mce, $chunk_ref, $chunk_id) = @_;
mandelbrot( @{ $chunk_ref} );
}, 0, $m - 1 );
# Write image to an ASCII PPM file.
write_to_ppm('mandelbrot.ppm');
| [reply] [d/l] |
Re: Threads From Hell #3: Missing Some Basic Prerequisites
by marioroy (Prior) on Jun 06, 2015 at 00:38 UTC
|
use strict;
use warnings;
use GD;
use MCE::Loop;
# based on original http://www.alfrog.com/mandel.html
# karlgoethebier: code refactor
# marioroy: parallelization
my $width = 1280;
my $height = 1024;
my $image = new GD::Image( $width, $height );
my @palette = $image->colorAllocate( 0, 0, 0 ); # black
my $iterations = 20;
for ( 1 .. $iterations ) {
my ( $r, $g, $b ) = map { int rand 255 } 1 .. 3;
push @palette, $image->colorAllocate( $r, $g, $b );
}
MCE::Loop::init {
use_threads => 0, # Windows default to threads, not thread safe
max_workers => 'auto', bounds_only => 1, gather => sub {
my ( $x, $y, $color );
while ( ($x, $y, $color) = splice(@_, 0, 3) ) {
$image->setPixel( $x, $y, $palette[ $color ] )
}
}
};
mce_loop_s {
my ( $mce, $chunk_ref, $chunk_id ) = @_;
my ( $re_c, $im_c, $re_z, $im_z, $color, $temp );
my ( @set_data );
for my $x ( $chunk_ref->[0] .. $chunk_ref->[1] ) {
for my $y ( 0 .. $height - 1 ) {
$re_c = ( $x - 3 * $width / 4 ) / ( $width / 3 );
$im_c = ( $y - $height / 2 ) / ( $width / 3 );
$re_z = $im_z = $color = 0;
while ( 1 ) {
$temp = $re_z;
$re_z = $re_z * $re_z - $im_z * $im_z + $re_c;
$im_z = 2 * $temp * $im_z + $im_c;
++$color;
last if $re_z * $re_z + $im_z * $im_z > 4;
if ( $color == $iterations ) {
$color = 0;
last;
}
}
push @set_data, $x, $y, $color;
}
}
MCE->gather( @set_data );
} 0, $width - 1;
MCE::Loop::finish;
open my $fh, '>mandelbrot.png' || exit 1;
binmode $fh;
print $fh $image->png;
close $fh;
| [reply] [d/l] |
|
|