Re: Why this code is so slow if run in thread?

Sorry for my earlier misdirection. By way of recompense, I have what I believe is a workaround (though it is essentially untested for lack of a suitable image) that addresses both the slowness of substr on utf strings within threads (which is just weird) and the problem I thought was the cause, that of cloning the returned array.

It avoids the former by doing away with the encoding, searching instead for runs of pairs of non-null characters in the unencoded pdl; and the latter by accumulating the counts in a packed binary array stored in a scalar.

sub test {
    my $fn  = shift;
    my $img = PDL::IO::Image-> new_from_file( $fn ) or die "Failed to load image";
    my $pdl = $img->pixels_to_pdl->short;
    my $s   = cc8compt( $pdl != 0 );
    my $str = ${ $s-> get_dataref };

    my ( $w, $h ) = $s-> dims;

    my $bounds = pack 'n4', $w, 0, $h, 0;
    $bounds x= $s->max;

    for my $y ( 0 .. $h - 1 ) {
        my $s = substr( $str, 2 * $y * $w, 2 * $w );
        while( $s =~ m[(?:[^\0][^\0])+]g ) {
            my( $l, $r ) = ( $-[0]/2, (($+[0])-1)/2 );
            my $c = ord( $& );
            vec( $bounds, 4*$c+0, 16 ) = $l if $l < vec( $bounds, 4*$c+0, 16 );
            vec( $bounds, 4*$c+1, 16 ) = $r if $r > vec( $bounds, 4*$c+1, 16 );
            vec( $bounds, 4*$c+2, 16 ) = $y if $y < vec( $bounds, 4*$c+2, 16 );
            vec( $bounds, 4*$c+3, 16 ) = $y if $y > vec( $bounds, 4*$c+3, 16 );
        }
    }
    return $bounds;
}
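
For anyone wanting to read the result back: a minimal sketch of unpacking the packed string that test() returns, assuming the layout used above (four 16-bit big-endian values per slot: left, right, top, bottom). The helper name unpack_bounds and the sample numbers are hypothetical, not part of the original code, and the sketch needs neither PDL nor an image.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the post): split the packed bounds string
# into one [ left, right, top, bottom ] record per slot.
sub unpack_bounds {
    my ( $bounds ) = @_;
    # 'n' is 16-bit big-endian, the same layout vec( ..., 16 ) reads and writes.
    my @v = unpack 'n*', $bounds;
    my @boxes;
    push @boxes, [ splice @v, 0, 4 ] while @v;
    return \@boxes;
}

# Build a tiny example the same way test() does: two slots, each
# initialised to ( width, 0, height, 0 ). The sizes are made up.
my ( $w, $h ) = ( 10, 8 );
my $bounds = pack( 'n4', $w, 0, $h, 0 ) x 2;

# Record one run for label 1 on row 3, covering columns 2 .. 5.
my ( $c, $l, $r, $y ) = ( 1, 2, 5, 3 );
vec( $bounds, 4*$c+0, 16 ) = $l if $l < vec( $bounds, 4*$c+0, 16 );
vec( $bounds, 4*$c+1, 16 ) = $r if $r > vec( $bounds, 4*$c+1, 16 );
vec( $bounds, 4*$c+2, 16 ) = $y if $y < vec( $bounds, 4*$c+2, 16 );
vec( $bounds, 4*$c+3, 16 ) = $y if $y > vec( $bounds, 4*$c+3, 16 );

my $boxes = unpack_bounds( $bounds );
printf "slot %d: x %d..%d, y %d..%d\n", $_, @{ $boxes->[$_] }
    for 0 .. $#$boxes;
```

A slot that was never touched keeps its initial values (left greater than right), which is one way to spot labels that matched no run. Keeping everything in a single scalar means a thread hands back one string rather than an array of arrays.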

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority". The enemy of (IT) success is complexity.

In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: Why this code is so slow if run in thread? by vr (Curate) on Dec 12, 2016 at 07:44 UTC
Here's a link: https://drive.google.com/open?id=0Bxkg1eqq0xXxVjFhSzg0cHRkQkE, if anyone wants to run these tests. Sorry for the delay with feedback. Thank you, everyone, for the answers, and BrowserUk for the code. I will test it later.
Re^3: Why this code is so slow if run in thread? by BrowserUk (Patriarch) on Dec 12, 2016 at 08:45 UTC
Results of my workaround on your test image:

C:\test>1177606 vrtest.png
C:\test>1177606 vrtest.png
No thread
---------
Took:0.818306923
Count: 145
Thread
---------
Took:2.834208012
Count: 145
Re^4: Why this code is so slow if run in thread? by vr (Curate) on Dec 12, 2016 at 11:00 UTC
You are right about "substr" being unexpectedly slow with utf strings and threads. This program takes 12 seconds on my machine (I wanted some Greek letters, but it looks like they were replaced with ugly codes; I think the idea is clear):

use utf8;
use threads;
threads-> create( sub {
    $s = 'αβγδ' x 1000_000;
    substr( $s, 0, 1000 ) for 1 .. 1000;
})-> join;
print time - $^T;

But then the simple solution is to first get a substring and only then decode it, i.e. to move "decode" into the loop. Then everything works as expected.
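
The "substring first, decode second" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper decoded_slice is hypothetical, and the sample data is chosen so that every character is exactly 2 bytes in UTF-8, which is what makes slicing at even byte offsets safe. With variable-width data a byte slice could split a character, so the offsets would have to be found some other way.

```perl
use strict;
use warnings;
use utf8;
use Encode qw( encode decode );

# Assumed sample data: each of these Greek letters is 2 bytes in UTF-8,
# so byte offsets that are multiples of 2 never split a character.
my $bytes = encode( 'UTF-8', 'αβγδ' x 1000 );    # plain byte string

# Hypothetical helper: slice the byte string first, then decode only
# the small slice instead of decoding the whole buffer up front.
sub decoded_slice {
    my ( $bytes_ref, $char_offset, $char_len ) = @_;
    my $chunk = substr( $$bytes_ref, 2 * $char_offset, 2 * $char_len );
    return decode( 'UTF-8', $chunk );
}

my $first = decoded_slice( \$bytes, 0, 4 );    # characters 0 .. 3
my $mid   = decoded_slice( \$bytes, 5, 3 );    # characters 5 .. 7
```

Here substr only ever runs on byte strings, where an offset is a plain byte position, and the decoded character strings stay short.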
Re^5: Why this code is so slow if run in thread? by BrowserUk (Patriarch) on Dec 12, 2016 at 12:01 UTC
Re^6: Why this code is so slow if run in thread? by vr (Curate) on Dec 12, 2016 at 12:11 UTC