buk2() below is about 50% faster than buk(), but the best is buk3() which 36x faster than buk() and a cool 7000 times faster than yr(). (That astounded me, and I didn't believe it at first, but it's true!):

`C:\test>1176081 -WIDTH=1000 -HEIGHT=1000
yr() took 295.895124
buk() took 1.483303
buk2() took 1.113015
buk3() took 0.042507
`

And a run on a 10kx10k "image" (without yr() as it would take days.):

`C:\test>1176081 -WIDTH=10000 -HEIGHT=10000
yr() took 0.000003
buk() took 150.481585
buk2() took 102.911382
buk3() took 3.710700
`

The test is only a crude simulation, so things may not pan out quite so well with the real data, but it worth a look :)

My test harness:

`#! perl -slw
use strict;
use Time::HiRes qw[ time ];
use Data::Dump qw[ pp ];
use constant { LEFT=>0, RIGHT=>1, TOP=>2, BOTTOM=>3 };
our $WIDTH //= 1000;
our $HEIGHT //= 1000;
sub makeObj{
my( $img, $x, $y, $size, $c ) = @_;
for my $y1 ( $y - ( $$size / 2 ) .. $y + ( $$size / 2 ) ) {
return () unless substr( $$img, $y1*$WIDTH + $x-(($$size+1)/2)
+, $$size ) = chr(0)x($$size);
}
for my $y1 ( $y - ( $$size / 2 ) .. $y + ( $$size / 2 ) ) {
substr( $$img, $y1 * $WIDTH + $x-(($$size+1)/2), $$size+2 ) =
+$c x($$size+2);
}
return 1;
}
sub yr {
# use integer; ## using int() below seemed faster than this.
my $str = shift;
my @b = map { [ [ $WIDTH, 0 ], [ $HEIGHT, 0 ] ] } 1 .. 256; #$s->
+max;
while ( $$str =~ /[^\x00]/g ) {
my $i = pos( $$str ) - 1;
my $x = $i % $WIDTH;
my $y = int( $i / $WIDTH );
my $c = ord( $& ) - 1;
$b[ $c ][ 0 ][ 0 ] = $x if $x < $b[ $c ][ 0 ][ 0 ];
$b[ $c ][ 0 ][ 1 ] = $x if $x > $b[ $c ][ 0 ][ 1 ];
$b[ $c ][ 1 ][ 0 ] = $y if $y < $b[ $c ][ 1 ][ 0 ];
$b[ $c ][ 1 ][ 1 ] = $y if $y > $b[ $c ][ 1 ][ 1 ];
}
return \@b;
}
sub buk {
my( $str ) = @_;
my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256;
my( $i, $x, $y, $c ) = 0;
for my $c ( unpack 'C*', $$str ) {
$x = $i % $WIDTH;
$y = int( $i / $WIDTH );
$b[ $c ][ LEFT ] = $x if $x < $b[ $c ][ LEFT ];
$b[ $c ][ RIGHT ] = $x if $x > $b[ $c ][ RIGHT ];
$b[ $c ][ TOP ] = $y if $y < $b[ $c ][ TOP ];
$b[ $c ][ BOTTOM ] = $y if $y > $b[ $c ][ BOTTOM ];
++$i;
}
return \@b;
}
sub buk2{
my $str = shift;
my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256;
for my $y ( 0 .. $HEIGHT-1 ) {
my $x = 0;
for my $c ( unpack'C*', substr $$str, $y * $WIDTH, $WIDTH ) {
$b[ $c ][ LEFT ] = $x if $x < $b[ $c ][ LEFT ];
$b[ $c ][ RIGHT ] = $x if $x > $b[ $c ][ RIGHT ];
$b[ $c ][ TOP ] = $y if $y < $b[ $c ][ TOP ];
$b[ $c ][ BOTTOM ] = $y if $y > $b[ $c ][ BOTTOM ];
++$x;
}
}
return \@b;
}
sub buk3{
my $str = shift;
my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256;
for my $y ( 0 .. $HEIGHT-1 ) {
my $x = 0;
while( substr( $$str, $y * $WIDTH, $WIDTH ) =~ m[(([^\0])+)]g
+) {
my $c = ord($1); #, $-[0], $+[0];
$b[ $c ][ LEFT ] = $-[0] if $-[0] < $b[ $c ][ LEFT ];
$b[ $c ][ RIGHT ] = $+[0] if $+[0] > $b[ $c ][ RIGHT ];
$b[ $c ][ TOP ] = $y if $y < $b[ $c ][ TOP ];
$b[ $c ][ BOTTOM ] = $y if $y > $b[ $c ][ BOTTOM ];
++$x;
}
}
return \@b;
}
my $pdl = chr(0); $pdl x= ( $WIDTH * $HEIGHT );
my( $x, $y ) = ( $WIDTH/2, $HEIGHT/2 );
for my $c ( 1 .. 255 ) {
my $size = 3 + rand( 200 );
my $sizeDiv2 = int( ( $size+1 ) / 2 );
do{
( $x, $y ) = ( $sizeDiv2 + rand( $WIDTH - $size - 1 ), $sizeDi
+v2 + rand( $HEIGHT - $size - 1 ) )
} until substr( $pdl, $y * $WIDTH + $x, 1 ) eq chr( 0 );
redo unless makeObj( \$pdl, $x, $y, \$size, chr( $c ) );
}
my $start = time;
my $yr = yr \$pdl;
my $end = time;
printf "yr() took %.6f\n", $end - $start;
$start = time;
my $buk = buk \$pdl;
$end = time;
printf "buk() took %.6f\n", $end - $start;
$start = time;
my $buk2 = buk2 \$pdl;
$end = time;
printf "buk2() took %.6f\n", $end - $start;
$start = time;
my $buk3 = buk3 \$pdl;
$end = time;
printf "buk3() took %.6f\n", $end - $start;
#<STDIN>;
#pp $buk; pp $buk3;
`

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

In the absence of evidence, opinion is indistinguishable from prejudice.

Comment onRe^3: PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects" (7000x faster)SelectorDownloadCode