Thanks, nice find. By the way, perhaps I chose an imprecise word ("sliding"). Maybe in math/science any function applied to overlapping infixes is said to be applied in a sliding/moving/rolling manner. What I meant by the word was "applied using a computationally efficient/cheap algorithm".
Looks like conv2d does a very honest (and therefore not efficient/cheap) 4-nested-loops style calculation in PP/XS/C, which is OK for very small kernels. Here's an overcrowded plot, but the B and E cases alone would suffice to show they are the same breed.
    sub sms_WxH_PDL_conv2d ( $m, $w, $h ) {
        $m -> conv2d( ones( $w, $h ))
           -> slice(
                [floor(($w-1)/2),floor(-($w+1)/2)],
                [floor(($h-1)/2),floor(-($h+1)/2)]
              )
    }

    __END__

    Time (s) vs. N (NxN submatrix, PDL: Double D [1500,1500] matrix)

    [ASCII plot: time axis from 0 to 1 s, N axis from 2 to 16, one series per letter A-E; the values are in the table below]

    sms_WxH_PDL_naive     A
    sms_WxH_pdlpp_4loops  B
    sms_WxH_PDL_lags      C
    sms_WxH_PDL_sliding   D
    sms_WxH_PDL_conv2d    E

    +----+-------+-------+-------+-------+-------+
    | N  | A     | B     | C     | D     | E     |
    +----+-------+-------+-------+-------+-------+
    | 2  | 0.061 | 0.009 | 0.030 | 0.083 | 0.020 |
    | 3  | 0.120 | 0.019 | 0.013 | 0.073 | 0.042 |
    | 4  | 0.252 | 0.030 | 0.039 | 0.098 | 0.066 |
    | 6  | 0.566 | 0.078 | 0.028 | 0.078 | 0.141 |
    | 8  | 0.963 | 0.139 | 0.033 | 0.081 | 0.237 |
    | 10 |       | 0.248 | 0.031 | 0.078 | 0.366 |
    | 12 |       | 0.388 | 0.033 | 0.078 | 0.531 |
    | 16 |       | 0.728 | 0.041 | 0.069 | 0.928 |
    +----+-------+-------+-------+-------+-------+
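To illustrate the distinction drawn above between an honest 4-nested-loops calculation and a computationally cheap one, here is a minimal pure-Perl sketch (no PDL). It is not the code behind any of the benchmarked subs: `box_sums_naive` and `box_sums_sat` are hypothetical names, and the summed-area table is just one well-known technique, assumed here for illustration, whose cost does not depend on the window size.

```perl
use strict;
use warnings;

# Hypothetical helper: sums of all $w x $h windows of a matrix given as an
# array of row arrayrefs. Honest nested loops: O(rows * cols * w * h).
sub box_sums_naive {
    my ($m, $w, $h) = @_;
    my ($rows, $cols) = (scalar @$m, scalar @{ $m->[0] });
    my @out;
    for my $r (0 .. $rows - $h) {
        for my $c (0 .. $cols - $w) {
            my $s = 0;
            for my $dr (0 .. $h - 1) {
                for my $dc (0 .. $w - 1) {
                    $s += $m->[$r + $dr][$c + $dc];
                }
            }
            $out[$r][$c] = $s;
        }
    }
    return \@out;
}

# Hypothetical helper: same result via a summed-area table.
# O(rows * cols) regardless of $w and $h.
sub box_sums_sat {
    my ($m, $w, $h) = @_;
    my ($rows, $cols) = (scalar @$m, scalar @{ $m->[0] });

    # $sat[$r][$c] holds the sum of all elements in rows < $r and columns < $c.
    my @sat = map { [ (0) x ($cols + 1) ] } 0 .. $rows;
    for my $r (1 .. $rows) {
        for my $c (1 .. $cols) {
            $sat[$r][$c] = $m->[$r - 1][$c - 1]
                         + $sat[$r - 1][$c] + $sat[$r][$c - 1]
                         - $sat[$r - 1][$c - 1];
        }
    }

    # Each window sum is four table lookups.
    my @out;
    for my $r (0 .. $rows - $h) {
        for my $c (0 .. $cols - $w) {
            $out[$r][$c] = $sat[$r + $h][$c + $w] - $sat[$r][$c + $w]
                         - $sat[$r + $h][$c]      + $sat[$r][$c];
        }
    }
    return \@out;
}

my $m    = [ [ 1, 2, 3, 4 ], [ 5, 6, 7, 8 ], [ 9, 10, 11, 12 ] ];
my $fast = box_sums_sat( $m, 2, 2 );
print join( ' ', @$_ ), "\n" for @$fast;
# prints:
# 14 18 22
# 30 34 38
```

The naive version does work proportional to the window area for every output element, which matches how B and E grow with N in the table, while the table-based version does a fixed four lookups per window.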
In reply to Re^2: Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)
by Anonymous Monk
in thread Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)
by Anonymous Monk