 Your skill will accomplishwhat the force of many cannot PerlMonks

PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects"

by vr (Curate)
 on Nov 18, 2016 at 11:35 UTC Need Help??

vr has asked for the wisdom of the Perl Monks concerning the following question:

For a kind of OCR-ish task, I need to find bboxes of all "objects" in 2D image. Then I can use slicing to extract sub-images as virtual piddles.

Suppose there's an image with 3 "objects":

pdl> p \$x [ [0 0 0 0 0 0] [1 0 0 0 0 0] [1 0 0 0 1 0] [1 0 1 1 1 0] [1 0 0 1 0 0] [1 0 0 0 0 1] [0 0 0 0 1 1] ]

And segmented:

pdl> \$s = cc8compt \$x pdl> p \$s [ [0 0 0 0 0 0] [1 0 0 0 0 0] [1 0 0 0 2 0] [1 0 2 2 2 0] [1 0 0 2 0 0] [1 0 0 0 0 3] [0 0 0 0 3 3] ]

That's quite fast. And then:

p Dumper map { \$obj = whichND( \$s == \$_ ); [ [ \$obj-> slice( 0 )-> minmax ], [ \$obj-> slice( 1 )-> minmax ], ] } 1 .. \$s-> max; \$VAR1 = [ [ 0, 0 ], [ 1, 5 ] ]; \$VAR2 = [ [ 2, 4 ], [ 2, 4 ] ]; \$VAR3 = [ [ 4, 5 ], [ 5, 6 ] ];

That gives bboxes in data structure I want. But it's too slow - tens of seconds for some-megapixel image and hundred of "objects", before I even began to use these data.

I wonder if it can be faster. Maybe to skip creating of a mask (\$s == \$_), i.e. hundred of them. Plus, the whichND gives a piddle of _all_ indexes - when all I want is a bounding box. Maybe there's a way to e.g. efficiently collapse a dimension looking for required index in segmented image (\$s, above). I feel I'm missing something obvious. Any ideas?

• Comment on PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects"

Replies are listed 'Best First'.
Re: PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects"
by BrowserUk (Patriarch) on Nov 18, 2016 at 12:17 UTC

I suspect this is one of those cases where vector operations don't buy you anything; indeed, they probably force a lot of extra work.

Once you have your "segmented image" -- ie. each object colored differently -- then a single left to right, top-to bottom pixel scan will discover the bounding boxes of all the objects.

In pseudo code:

use constant { LEFT=>0, RIGHT=>1, TOP=>2, BOTTOM=>3 }; my @bounds; for my \$y ( 0 .. \$height -1 ){ for my \$x ( 0 .. \$width - 1 ) { my \$pixel = \$image->pixel( \$x, \$y ); \$bounds[ \$pixel ][ LEFT ] = \$x if \$x < ( \$bounds[ \$pixel ][ +LEFT ] // 1e99 ); \$bounds[ \$pixel ][ RIGHT ] = \$x if \$x > ( \$bounds[ \$pixel ][ +RIGHT ] // 0 ); \$bounds[ \$pixel ][ TOP ] = \$y if \$y < ( \$bounds[ \$pixel ][ +TOP ] // 1e99 ); \$bounds[ \$pixel ][ BOTTOM ] = \$y if \$y > ( \$bounds[ \$pixel ][ +BOTTOM ] // 0 ); } }

Unfortunately I cannot help you with applying that to PDL because I gave up on trying to find an efficient way to apply a piece of perl code to all the elements of a piddle.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.

You are right, I was hoping to stay (do all work) in PDL domain, i.e. maybe I overlooked function e.g. similar to CORE::index or even List::Util::first.

Now my solution is this:

# \$s is our segmented image use integer; my ( \$w, \$h ) = \$s-> dims; my \$str = \${ \$s-> byte-> get_dataref }; my @b = map { [ [ \$w, 0 ], [ \$h, 0 ] ] } 1 .. \$s-> max; while ( \$str =~ /[^\x00]/g ) { my \$i = pos( \$str ) - 1; my \$x = \$i % \$w; my \$y = \$i / \$w; my \$c = ord( \$& ) - 1; \$b[ \$c ][ 0 ][ 0 ] = \$x if \$x < \$b[ \$c ][ 0 ][ 0 ]; \$b[ \$c ][ 0 ][ 1 ] = \$x if \$x > \$b[ \$c ][ 0 ][ 1 ]; \$b[ \$c ][ 1 ][ 0 ] = \$y if \$y < \$b[ \$c ][ 1 ][ 0 ]; \$b[ \$c ][ 1 ][ 1 ] = \$y if \$y > \$b[ \$c ][ 1 ][ 1 ]; }

It's fast, and will get slightly more complex for 2-byte "characters", if number of "objects" is more than 255.

Btw, this single scan with regex engine is twice (or _only_ twice?) as fast as multiple scans with index and rindex (actually 4 scans per "object", on concatenated rows and concatenated columns).

p.s. But, ehm, direct per-pixel scan (accessing \$s-> at( \$x, \$y )) is 5 times slower than PDL-only solution.

buk2() below is about 50% faster than buk(), but the best is buk3() which 36x faster than buk() and a cool 7000 times faster than yr(). (That astounded me, and I didn't believe it at first, but it's true!):

C:\test>1176081 -WIDTH=1000 -HEIGHT=1000 yr() took 295.895124 buk() took 1.483303 buk2() took 1.113015 buk3() took 0.042507

And a run on a 10kx10k "image" (without yr() as it would take days.):

C:\test>1176081 -WIDTH=10000 -HEIGHT=10000 yr() took 0.000003 buk() took 150.481585 buk2() took 102.911382 buk3() took 3.710700

The test is only a crude simulation, so things may not pan out quite so well with the real data, but it worth a look :)

My test harness:

#! perl -slw use strict; use Time::HiRes qw[ time ]; use Data::Dump qw[ pp ]; use constant { LEFT=>0, RIGHT=>1, TOP=>2, BOTTOM=>3 }; our \$WIDTH //= 1000; our \$HEIGHT //= 1000; sub makeObj{ my( \$img, \$x, \$y, \$size, \$c ) = @_; for my \$y1 ( \$y - ( \$\$size / 2 ) .. \$y + ( \$\$size / 2 ) ) { return () unless substr( \$\$img, \$y1*\$WIDTH + \$x-((\$\$size+1)/2) +, \$\$size ) = chr(0)x(\$\$size); } for my \$y1 ( \$y - ( \$\$size / 2 ) .. \$y + ( \$\$size / 2 ) ) { substr( \$\$img, \$y1 * \$WIDTH + \$x-((\$\$size+1)/2), \$\$size+2 ) = +\$c x(\$\$size+2); } return 1; } sub yr { # use integer; ## using int() below seemed faster than this. my \$str = shift; my @b = map { [ [ \$WIDTH, 0 ], [ \$HEIGHT, 0 ] ] } 1 .. 256; #\$s-> +max; while ( \$\$str =~ /[^\x00]/g ) { my \$i = pos( \$\$str ) - 1; my \$x = \$i % \$WIDTH; my \$y = int( \$i / \$WIDTH ); my \$c = ord( \$& ) - 1; \$b[ \$c ][ 0 ][ 0 ] = \$x if \$x < \$b[ \$c ][ 0 ][ 0 ]; \$b[ \$c ][ 0 ][ 1 ] = \$x if \$x > \$b[ \$c ][ 0 ][ 1 ]; \$b[ \$c ][ 1 ][ 0 ] = \$y if \$y < \$b[ \$c ][ 1 ][ 0 ]; \$b[ \$c ][ 1 ][ 1 ] = \$y if \$y > \$b[ \$c ][ 1 ][ 1 ]; } return \@b; } sub buk { my( \$str ) = @_; my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256; my( \$i, \$x, \$y, \$c ) = 0; for my \$c ( unpack 'C*', \$\$str ) { \$x = \$i % \$WIDTH; \$y = int( \$i / \$WIDTH ); \$b[ \$c ][ LEFT ] = \$x if \$x < \$b[ \$c ][ LEFT ]; \$b[ \$c ][ RIGHT ] = \$x if \$x > \$b[ \$c ][ RIGHT ]; \$b[ \$c ][ TOP ] = \$y if \$y < \$b[ \$c ][ TOP ]; \$b[ \$c ][ BOTTOM ] = \$y if \$y > \$b[ \$c ][ BOTTOM ]; ++\$i; } return \@b; } sub buk2{ my \$str = shift; my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256; for my \$y ( 0 .. \$HEIGHT-1 ) { my \$x = 0; for my \$c ( unpack'C*', substr \$\$str, \$y * \$WIDTH, \$WIDTH ) { \$b[ \$c ][ LEFT ] = \$x if \$x < \$b[ \$c ][ LEFT ]; \$b[ \$c ][ RIGHT ] = \$x if \$x > \$b[ \$c ][ RIGHT ]; \$b[ \$c ][ TOP ] = \$y if \$y < \$b[ \$c ][ TOP ]; \$b[ \$c ][ BOTTOM ] = \$y if \$y > \$b[ \$c ][ BOTTOM ]; ++\$x; } } return \@b; } sub buk3{ my \$str = shift; my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256; for my \$y ( 0 .. \$HEIGHT-1 ) { my \$x = 0; while( substr( \$\$str, \$y * \$WIDTH, \$WIDTH ) =~ m[(([^\0])+)]g +) { my \$c = ord(\$1); #, \$-, \$+; \$b[ \$c ][ LEFT ] = \$- if \$- < \$b[ \$c ][ LEFT ]; \$b[ \$c ][ RIGHT ] = \$+ if \$+ > \$b[ \$c ][ RIGHT ]; \$b[ \$c ][ TOP ] = \$y if \$y < \$b[ \$c ][ TOP ]; \$b[ \$c ][ BOTTOM ] = \$y if \$y > \$b[ \$c ][ BOTTOM ]; ++\$x; } } return \@b; } my \$pdl = chr(0); \$pdl x= ( \$WIDTH * \$HEIGHT ); my( \$x, \$y ) = ( \$WIDTH/2, \$HEIGHT/2 ); for my \$c ( 1 .. 255 ) { my \$size = 3 + rand( 200 ); my \$sizeDiv2 = int( ( \$size+1 ) / 2 ); do{ ( \$x, \$y ) = ( \$sizeDiv2 + rand( \$WIDTH - \$size - 1 ), \$sizeDi +v2 + rand( \$HEIGHT - \$size - 1 ) ) } until substr( \$pdl, \$y * \$WIDTH + \$x, 1 ) eq chr( 0 ); redo unless makeObj( \\$pdl, \$x, \$y, \\$size, chr( \$c ) ); } my \$start = time; my \$yr = yr \\$pdl; my \$end = time; printf "yr() took %.6f\n", \$end - \$start; \$start = time; my \$buk = buk \\$pdl; \$end = time; printf "buk() took %.6f\n", \$end - \$start; \$start = time; my \$buk2 = buk2 \\$pdl; \$end = time; printf "buk2() took %.6f\n", \$end - \$start; \$start = time; my \$buk3 = buk3 \\$pdl; \$end = time; printf "buk3() took %.6f\n", \$end - \$start; #<STDIN>; #pp \$buk; pp \$buk3;

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.

You might try this. It expects a reference to the raw string, and the width & height available in globals -- adapt to your preferences:

sub buk { my( \$str ) = @_; my @b = map[ ( 1e99, 0 ) x 2 ], 1 .. 256; #\$s-> max; my( \$i, \$x, \$y, \$c ) = 0; for my \$c ( unpack 'C*', \$\$str ) { \$x = \$i % \$WIDTH; \$y = int( \$i / \$WIDTH ); \$b[ \$c ][ LEFT ] = \$x if \$x < \$b[ \$c ][ LEFT ]; \$b[ \$c ][ RIGHT ] = \$x if \$x > \$b[ \$c ][ RIGHT ]; \$b[ \$c ][ TOP ] = \$y if \$y < \$b[ \$c ][ TOP ]; \$b[ \$c ][ BOTTOM ] = \$y if \$y > \$b[ \$c ][ BOTTOM ]; ++\$i; } return \@b; }

In my simulated tests, it performs 224 times faster than your posted code on a 1000x1000x1byte image and 1700 times faster on a 2000x2000:

C:\test>1176081 -WIDTH=999 -HEIGHT=999 yr() took 329.728896 buk() took 1.474515 C:\test>1176081 -WIDTH=2000 -HEIGHT=2000 yr() took 10265.728694 buk() took 5.976180

It is also trivially adaptable to images that use short/word/doubleword/quadword sized pixels.

I have a couple of ideas for a faster version, which I'll post if they pan out.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: PDL: Looking for efficient way to extract sub-images, by finding bounding boxes of "objects" (BAD is good for you!)
by vr (Curate) on Sep 26, 2017 at 00:09 UTC
I'm missing something obvious

I was! Yesterday I accidentally found this nice recipe. The key to avoiding calling whichND hundreds or thousands of times is in using functions qsorti (sort 1D array, but return rather an array of indexes into original array), one2nd (use shape of invocant to convert 1D array of indexes to list of N arrays of indexes, i.e. per coordinate (dimension)), and rle. The one2nd appears to destroy its argument, use with caution. Maybe a demonstration will help:

pdl> p \$im [ [0 0 1 1 0 0] [2 0 0 1 0 0] [2 2 0 0 0 3] [0 0 0 0 3 0] ] pdl> p \$im-> flat-> qsort # Won't be used. Just a demo. [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 2 2 2 3 3] pdl> p \$idx = \$im-> flat-> qsorti [15 16 14 10 11 21 23 20 18 19 4 5 0 1 7 8 3 2 9 13 6 12 22 17] pdl> p +( \$runs ) = \$im-> flat-> qsort -> rle [16 3 3 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] pdl> ( \$x, \$y ) = \$im-> one2nd( \$idx ) pdl> p \$x [3 4 2 4 5 3 5 2 0 1 4 5 0 1 1 2 3 2 3 1 0 0 4 5] pdl> p \$y [2 2 2 1 1 3 3 3 3 3 0 0 0 0 1 1 0 0 1 2 1 2 3 2] pdl> p \$idx # NB. Destroyed! [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] pdl>

The trailing zeroes in \$runs are to be killed manually, per documentation. So then, \$runs holds lengths of intervals in \$x and \$y lists, belonging to the same "object" in an image. So it's easy to slice these \$x and \$y, and find min and max for each "object", i.e. "bounding boxes".

Recipe claims to be 60 times faster than multiple whichND's for some sample image. IIRC, approx the same advantage BrowserUK's solution, fastest to date, had over naive straightforward PDL code (i.e. "multiple whichND's"). In short, it's worth a try. The test subject will be an image from related thread. Code using regexp engine is almost verbatim from "trusted and tested" sub used "in production" for a year, based on BrowserUK's code.

use strict; use warnings; use feature 'say'; use PDL; use PDL::NiceSlice; use PDL::IO::Image; use PDL::Image2D; use PDL::ImageND; use Encode 'decode'; use Benchmark 'cmpthese'; use Data::Dump; my \$fn = 'test.png'; my \$img = PDL::IO::Image-> new_from_file( \$fn ); my \$pdl = \$img-> pixels_to_pdl-> short; my \$sgm = cc8compt( \$pdl == 0 )-> short; # "Short" in line above is required for PDL that ships # with Strawberry 5.26. Otherwise \$sgm gets long. Wasn't # so in 5.24. Not nice. # \$sgm ('segmented image') will be a test subject. cmpthese( 5, { pdl_naive => \&get_bboxes_pdl_naive, regex => \&get_bboxes_regex, pdl => \&get_bboxes_pdl, }); sub get_bboxes_pdl_naive { return [ map { my \$obj = whichND( \$sgm == \$_ ); [ [ \$obj-> slice( 0 )-> minmax ], [ \$obj-> slice( 1 )-> minmax ], ] } 1 .. \$sgm-> max ] } sub get_bboxes_regex { my ( \$w, \$h ) = \$sgm-> dims; my \$str = \${ \$sgm-> get_dataref }; my @b = map { [ [ \$w, 0 ], [ \$h, 0 ] ] } 0 .. \$sgm-> max; \$w *= 2; for my \$y ( 0 .. \$h - 1 ) { my \$s = decode 'UTF16LE', substr( \$str, \$y * \$w, \$w ); while( \$s =~ m/[^\0]+/g ) { my \$c = ord( \$& ); \$b[ \$c ] = \$- if \$- < \$b[ \$c ]; \$b[ \$c ] = \$+ - 1 if \$+ - 1 > \$b[ \$c ]; \$b[ \$c ] = \$y if \$y < \$b[ \$c ]; \$b[ \$c ] = \$y if \$y > \$b[ \$c ]; } } shift @b; return \@b } sub get_bboxes_pdl { my \$idx = \$sgm-> flat-> qsorti; my ( \$runs ) = \$sgm-> flat-> index( \$idx )-> rle; \$runs = \$runs-> where( \$runs != 0 )-> cumusumover; my ( \$x, \$y ) = \$sgm-> one2nd( \$idx ); return [ map { my \$start = \$runs( \$_ - 1 ); my \$stop = \$runs( \$_ ) - 1; [[ \$x( \$start:\$stop )-> minmax ], [ \$y( \$start:\$stop )-> minmax ]] } 1 .. \$runs-> nelem - 1 ] }

s/iter pdl_naive regex pdl pdl_naive 5.53 -- -73% -81% regex 1.51 266% -- -29% pdl 1.07 417% 41% --

Of course, 60 or 6 times faster depends on machine, image, etc. But nice. Finally, fast Perl solution that doesn't use regular expressions :-).

Though it can be much faster. Most of the image is uninteresting background (zeroes). I certainly don't need X and Y coords for every pixel of background. As, currently, \$x and \$y hold them. And what about sorting? What if zeroes are replaced with bad values? I suspect sort will be faster, because it just "pushes" them to the end of array. While zeroes are still honestly "sorted".

use strict; use warnings; use PDL; use Benchmark 'cmpthese'; my \$x = (( random 10_000_000 ) * 2**16 )-> ushort; my \$bads = \$x-> setbadif( \$x > 2**13 ); # 88% is dull 'background' my \$zeroes = \$bads-> setbadtoval( 0 ); cmpthese( 5, { zeroes => sub { \$zeroes-> qsort }, bads => sub { \$bads-> qsort }, });

So then:

sub get_bboxes_pdl_bad { my \$flat = \$sgm-> flat-> setvaltobad( 0 ); my \$idx = \$flat-> qsorti; \$idx = \$idx-> where( \$flat-> index( \$idx )-> isgood ); my ( \$runs ) = \$flat-> index( \$idx )-> rle; \$runs = \$runs-> where( \$runs != 0 )-> cumusumover - 1; my ( \$x, \$y ) = \$sgm-> one2nd( \$idx ); my \$stop = -1; return [ map { my \$start = \$stop + 1; \$stop = \$_; [[ \$x( \$start:\$stop )-> minmax ], [ \$y( \$start:\$stop )-> minmax ]] } \$runs-> list ] }

s/iter regex pdl pdl_bad regex 1.51 -- -30% -87% pdl 1.06 42% -- -81% pdl_bad 0.203 641% 421% --

Update: to explain my recollections about regex solution being several tens of times faster than "naive" PDL, and to explain regex's results from a year ago to be faster with, actually, the same image (about 15 times faster on same hardware), -- a byte string was used back then. I.e. if number of "objects" is known to be limited to 255, then regular expressions are very fast. And preferred, with regard to speed. As soon as string becomes Unicode string, as above, -- the speed drops significantly, and nothing helps.

Plus, to make "pdl_bad" solution approx. 25% faster, the 2 lines calculating \$idx can be replaced with

my \$idx = \$flat-> qsorti-> slice([ 0, \$flat-> ngood - 1 ]);

Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1176081]
Approved by marto
Front-paged by Discipulus
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others scrutinizing the Monastery: (4)
As of 2022-01-18 21:44 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
In 2022, my preferred method to securely store passwords is:

Results (54 votes). Check out past polls.

Notices?