in reply to Re: Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)
in thread Fast sliding submatrix sums with PDL (inspired by PWC 248 task 2)
Thanks, nice find. By the way, perhaps I've chosen an imprecise word ("sliding"). Maybe in math/science, any function applied to overlapping infixes is said to be applied in a sliding/moving/rolling manner. What I meant by the word was instead "applied using a computationally efficient/cheap algorithm".
Looks like conv2d does a very honest (and therefore not efficient/cheap) calculation in the 4-nested-loops style, in PP/XS/C, which is OK for very small kernels. Here's an overcrowded plot, but the B and E cases alone would suffice to show they are of the same breed.
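To make the "honest" vs. "cheap" distinction concrete, here is a minimal plain-Perl sketch (no PDL, so nothing here is the code that was benchmarked in this thread). `box_sums_naive` re-adds all w*h elements for every window position, which is the 4-nested-loops style; `box_sums_sat` uses a summed-area table, one well-known way to get each window sum from four lookups regardless of window size. Both sub names and the array-of-rows layout are my own assumptions for illustration.

```perl
use strict;
use warnings;

# Naive: four nested loops, cost grows with $w * $h per output element.
sub box_sums_naive {
    my ( $m, $w, $h ) = @_;                       # $m: ref to array of rows
    my ( $rows, $cols ) = ( scalar @$m, scalar @{ $m->[0] } );
    my @out;
    for my $y ( 0 .. $rows - $h ) {
        for my $x ( 0 .. $cols - $w ) {
            my $s = 0;
            for my $dy ( 0 .. $h - 1 ) {
                $s += $m->[ $y + $dy ][ $x + $_ ] for 0 .. $w - 1;
            }
            $out[$y][$x] = $s;
        }
    }
    return \@out;
}

# Summed-area table: $sat[$y][$x] holds the sum of rows 0..$y-1 and
# columns 0..$x-1 of $m (row 0 and column 0 are zero padding).
# Every window sum then costs four lookups, independent of $w and $h.
sub box_sums_sat {
    my ( $m, $w, $h ) = @_;
    my ( $rows, $cols ) = ( scalar @$m, scalar @{ $m->[0] } );
    my @sat = map { [ (0) x ( $cols + 1 ) ] } 0 .. $rows;
    for my $y ( 1 .. $rows ) {
        for my $x ( 1 .. $cols ) {
            $sat[$y][$x] = $m->[ $y - 1 ][ $x - 1 ]
                         + $sat[ $y - 1 ][$x]
                         + $sat[$y][ $x - 1 ]
                         - $sat[ $y - 1 ][ $x - 1 ];
        }
    }
    my @out;
    for my $y ( 0 .. $rows - $h ) {
        for my $x ( 0 .. $cols - $w ) {
            $out[$y][$x] = $sat[ $y + $h ][ $x + $w ]
                         - $sat[$y][ $x + $w ]
                         - $sat[ $y + $h ][$x]
                         + $sat[$y][$x];
        }
    }
    return \@out;
}

my $m = [ [ 1, 2, 3, 4 ], [ 5, 6, 7, 8 ], [ 9, 10, 11, 12 ] ];
print "@$_\n" for @{ box_sums_sat( $m, 2, 2 ) };
# prints:
# 14 18 22
# 30 34 38
```

With the table built once, the time per call should stay roughly flat as the window grows, while the naive version scales with the window area. That is the distinction I had in mind, though I'm not claiming this is how any of the subs in the plot are implemented.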
    sub sms_WxH_PDL_conv2d ( $m, $w, $h ) {
        $m -> conv2d( ones( $w, $h ))
           -> slice(
                [ floor(($w-1)/2), floor(-($w+1)/2) ],
                [ floor(($h-1)/2), floor(-($h+1)/2) ]
              )
    }

    __END__

    Time (s) vs. N (NxN submatrix, PDL: Double D [1500,1500] matrix)

    [ASCII plot not reproduced: y axis is time, 0 to 1 s; x axis is N, 2 to 16; the plotted values are those in the table below]

    sms_WxH_PDL_naive     A
    sms_WxH_pdlpp_4loops  B
    sms_WxH_PDL_lags      C
    sms_WxH_PDL_sliding   D
    sms_WxH_PDL_conv2d    E

    +----+-------+-------+-------+-------+-------+
    | N  | A     | B     | C     | D     | E     |
    +----+-------+-------+-------+-------+-------+
    |  2 | 0.061 | 0.009 | 0.030 | 0.083 | 0.020 |
    |  3 | 0.120 | 0.019 | 0.013 | 0.073 | 0.042 |
    |  4 | 0.252 | 0.030 | 0.039 | 0.098 | 0.066 |
    |  6 | 0.566 | 0.078 | 0.028 | 0.078 | 0.141 |
    |  8 | 0.963 | 0.139 | 0.033 | 0.081 | 0.237 |
    | 10 |       | 0.248 | 0.031 | 0.078 | 0.366 |
    | 12 |       | 0.388 | 0.033 | 0.078 | 0.531 |
    | 16 |       | 0.728 | 0.041 | 0.069 | 0.928 |
    +----+-------+-------+-------+-------+-------+
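The slice bounds in sms_WxH_PDL_conv2d are easy to misread, so here is a small plain-Perl check of the index arithmetic only (it does not call PDL). For a dimension of a given size, a slice from `floor(($w-1)/2)` to the negative, from-the-end index `floor(-($w+1)/2)` keeps exactly size - w + 1 elements, for odd and even w alike. `kept_length` is a hypothetical helper name of mine.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: how many elements survive along one dimension
# when slicing from $first to the negative (from-the-end) index $last.
sub kept_length {
    my ( $size, $w ) = @_;
    my $first = floor( ( $w - 1 ) / 2 );
    my $last  = floor( -( $w + 1 ) / 2 );    # negative: counted from the end
    return ( $size + $last ) - $first + 1;
}

for my $w ( 2, 3, 4, 6, 8, 10, 12, 16 ) {
    printf "w=%2d: keeps %d of 1500, expected %d\n",
        $w, kept_length( 1500, $w ), 1500 - $w + 1;
}
```

For odd w the same number of elements is trimmed on both sides (w=3 gives indices 1 to -2); for even w one more is trimmed at the far end (w=4 gives 1 to -3). Whether that offset matches how conv2d centres an even-sized kernel is something I'm taking from the sub above, not something this snippet verifies.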