vr has asked for the wisdom of the Perl Monks concerning the following question:

    use strict;
    use warnings;
    use feature 'say';
    use PDL;
    use PDL::Image2D;
    use PDL::IO::Image;

    # sample: http://image.ibb.co/i6Qj76/test171217.png
    my $fn = 'test171217.png';
    my $pdl = PDL::IO::Image-> new_from_file( $fn )-> pixels_to_pdl-> short;
    say $pdl-> info;
    my $segmented = cc8compt( $pdl );
    say $segmented-> info;
    say $segmented-> max;

The function cc8compt in PDL 2.015, which shipped with Strawberry 5.24.0, returns a piddle of the same data type as its argument, e.g. short for short. Later versions appear to always return long, so the bug described here cannot be observed with them.

Running the script several times, I'm getting something like this:

    D:\>perl 171217.pl
    PDL: Short D [1200,3950]
    PDL: Short D [1200,3950]
    32756

    D:\>perl 171217.pl
    PDL: Short D [1200,3950]
    PDL: Short D [1200,3950]
    32764

    D:\>perl 171217.pl
    PDL: Short D [1200,3950]
    PDL: Short D [1200,3950]
    32736

    D:\>perl 171217.pl
    PDL: Short D [1200,3950]
    PDL: Short D [1200,3950]
    32756

    D:\>perl 171217.pl
    PDL: Short D [1200,3950]
    PDL: Short D [1200,3950]
    32760

The result is unpredictable, and I'm puzzled why. Google says that signed integer overflow is "undefined behaviour", but shouldn't the result at least be deterministic? If I'm wrong, what exactly is happening during program execution? And BTW, changing short to long in the code leads to a correct (?) maximum count of 28299, a value below the maximum positive short, so it's unclear why any problems happen at all (I'm not brave enough to look into the PDL source).

I also tried a simple sample piddle, e.g. a kind of checkerboard, but that did not reproduce the problem described above.


Replies are listed 'Best First'.
Re: [maybe OT] What kind of bug was that? Non-deterministic result with C integer overflow?
by Anonymous Monk on Dec 17, 2017 at 15:19 UTC
    Overflow is "undefined" in the sense that it might give you a different result on different platforms.

    My money is on this being an array bounds error in PDL's connected-component labeling routine. That's an interesting topic, check out the link.

Re: [maybe OT] What kind of bug was that? Non-deterministic result with C integer overflow?
by Laurent_R (Canon) on Dec 18, 2017 at 00:04 UTC
    The result is unpredictable, and I'm puzzled why -- even if Google says that signed integer overflow is "undefined behaviour", but shouldn't result be deterministic, at least?
    Not necessarily. I don't know exactly what is happening here, but you might just get some value from a memory address that has not been properly initialized, and could therefore contain anything just being there when you run your program. This may not be very common in Perl, but it does happen in C programs. Just my two cents worth, I don't know enough about PDL to be more helpful.

Re: [maybe OT] What kind of bug was that? Non-deterministic result with C integer overflow?
by Anonymous Monk on Dec 18, 2017 at 13:51 UTC

    It is an array bounds error. Take a look at PDL-2.016/Changes:

    * Bugs fixed:
        ...
        414   ccNcompt (i.e. cc4compt and cc8compt) breaks with byte data type
    
    Which refers to this bug.

    The algorithm allocates an equiv[] array to store label equivalence lists. When your running label wraps (a short int), it becomes a (small) negative value. This in turn is used as an index to chase the equiv[] list. In other words, you are accessing memory before the allocated object. Perhaps it would hang or fault if heap poisoning were applied.
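
The wraparound described above can be sketched in C. This is only an illustration, not PDL's source: `next_label`, `label_in_bounds` and `show_wrap` are hypothetical helpers, and the sketch assumes a typical platform with a 16-bit short where the out-of-range conversion wraps (that conversion is implementation-defined in C). It checks whether the wrapped label would be a valid index and never performs the out-of-bounds access itself.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: advance a running label that is stored in a short.
   The addition is done in int; converting the out-of-range result back to
   short is implementation-defined and wraps to a negative value on common
   two's-complement platforms. */
short next_label(short label)
{
    return (short)(label + 1);
}

/* Hypothetical helper: report whether `label` would be a valid index into
   an equivalence table of `size` entries, without dereferencing anything. */
int label_in_bounds(short label, size_t size)
{
    return label >= 0 && (size_t)label < size;
}

/* Print what happens to the largest short label when it is advanced. */
void show_wrap(void)
{
    short label = SHRT_MAX;
    short wrapped = next_label(label);

    printf("%d -> %d, usable as index into 40000 entries: %s\n",
           label, wrapped,
           label_in_bounds(wrapped, 40000) ? "yes" : "no");
}
```

Calling `show_wrap()` from `main` on such a platform should report a negative wrapped label that is not a usable index. Code that indexes the table with it anyway reads whatever happens to sit in front of the allocation, which would explain results that change from run to run.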

    The count of 28299 is below the maximum short value only after label equivalences are resolved; the provisional labels handed out during the scan go beyond it.
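
Why the provisional label count can exceed the final count is easiest to see on a tiny image. The following is a minimal sketch of a textbook two-pass labeling scheme, assuming 4-connectivity for brevity (the thread uses cc8compt, which is 8-connected); it is not PDL's implementation, and `count_components`, `find_root` and `U_SHAPE` are names made up for the illustration.

```c
#include <stddef.h>

enum { W = 3, H = 3, MAX_LABELS = W * H + 1 };

/* A "U" shape: the two arms get separate provisional labels before the
   bottom row shows that they belong to one component. */
const unsigned char U_SHAPE[H][W] = {
    { 1, 0, 1 },
    { 1, 0, 1 },
    { 1, 1, 1 },
};

/* Follow parent links to the representative of a label's equivalence class. */
static int find_root(const int *parent, int label)
{
    while (parent[label] != label)
        label = parent[label];
    return label;
}

/* Raster-scan the image, handing out provisional labels and recording
   equivalences. Stores the number of provisional labels in *provisional
   and returns the number of components left after merging. */
int count_components(const unsigned char img[H][W], int *provisional)
{
    int parent[MAX_LABELS];
    int labels[H][W];
    int next = 1;

    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            labels[y][x] = 0;
            if (!img[y][x])
                continue;
            int up = y > 0 ? labels[y - 1][x] : 0;
            int left = x > 0 ? labels[y][x - 1] : 0;
            if (!up && !left) {
                parent[next] = next;          /* new provisional label */
                labels[y][x] = next++;
            } else if (up && left) {
                int ru = find_root(parent, up);
                int rl = find_root(parent, left);
                int lo = ru < rl ? ru : rl;
                int hi = ru < rl ? rl : ru;
                parent[hi] = lo;              /* record the equivalence */
                labels[y][x] = lo;
            } else {
                labels[y][x] = up ? up : left;
            }
        }
    }

    *provisional = next - 1;

    int final_count = 0;
    for (int l = 1; l < next; l++)
        if (parent[l] == l)                   /* one root per component */
            final_count++;
    return final_count;
}
```

For `U_SHAPE` the scan hands out two provisional labels that collapse into a single component. In a scheme like this the running label has to hold the provisional count, so a final maximum of 28299 does not rule out the counter having passed 32767 along the way.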

      Thank you for the answers. I got a bit carried away with "non-determinism"; it's just random data in unrelated memory. And now I see why I couldn't reproduce this bug with a checkerboard pattern: something like right-leaning triangles is required:

      my $bin = ( xvals( short, 1024, 400 ) % 4 + yvals( short, 1024, 400 ) % 4 * 2 ) > 6;
      say $bin-> info;
      say $bin-> slice( ([ 0, 7 ]) x 2 );
      say $bin-> cc8compt-> max;
      say $bin-> long-> cc8compt-> max;

      PDL: Short D [1024,400]
      [
       [0 0 0 0 0 0 0 0]
       [0 0 0 0 0 0 0 0]
       [0 0 0 1 0 0 0 1]
       [0 1 1 1 0 1 1 1]
       [0 0 0 0 0 0 0 0]
       [0 0 0 0 0 0 0 0]
       [0 0 0 1 0 0 0 1]
       [0 1 1 1 0 1 1 1]
      ]
      31344
      25600

      I will never forget the very first time that this same (type of) bug bit me in the ass – in another context and in another life, long ago.