in reply to Reflections on graphic screwing!

Here is a PDL version, that still has loops (gasp) but uses Judd Taylor's excellent PDL binding for GD. It goes about 5 times quicker than the version in the OP (edit to note: actually, for the fancy -M=4 -P=1.5 -S=200 -L=10 one, the OP takes about 70s, this PDL one takes 7s, so 10x quicker):
#! perl -sw use strict; use PDL; use PDL::Constants qw(PI DEGRAD); use PDL::IO::GD; use constant FNAME => 'CGScrewPDL'; use constant { COS30 => 0.86602540378443864676372317075294, TAN30 => 0.57735026918962576450914878050196, }; use enum qw[ X Y A ]; ## Construct thread profile segment sub constructThreadProfile { my @points; my( $dia, $pitch, $yInc ) = @_; my $H = $pitch * COS30; # draw flat crest my $x = $dia / 2; my $y = -$yInc; my $yLim = $pitch / 16; push @points, [ $x, $y += $yInc, 1 ] while $y < $yLim; ## draw upper 30deg flank. my $xd = $yInc / TAN30; $yLim = $y + 5 / 16 * $pitch; push @points, [ $x -= $xd, $y += $yInc, 0.5 ] while $y < $yLim; ## draw root radius $yLim = $y + 4 / 16 * $pitch; my( $cx, $cy, $r ) = ( ( $dia/2 - 7/8*$H + $H/3 ), $pitch / 2, $H +/ 6 ); while( $y < $yLim ) { my $dy = $cy - $y; my $dx = sqrt( $r**2 - $dy**2 ); push @points, [ $cx - $dx, $y, $dx / $r ]; $y += $yInc; } $y -= $yInc; ## draw lower 30deg flank $yLim = $y + 5 / 16 * $pitch; push @points, [ $x += $xd, $y += $yInc, - 0.5 ] while $y < $yLim; $yLim = $pitch; push @points, [ $x, $y += $yInc, 1 ] while $y < $yLim; return pdl \@points; } our $M //= 10; our $P //= 1.5; our $L //= 2; our $S //= 100; my $fname = sprintf "%sM%.2fxP%.2fxL%.2fxS%d.png", FNAME, $M, $P, $L, +$S; my $profilePDL = constructThreadProfile( $M, $P, 1 / ( 10* $S ) ); my( $w, $h ) = ( $M * $S + 200, int( $L / $P + 1 ) * $P * $S + 200 ); my $xc = $w / 2; my $imPDL = zeroes(byte, $w, $h) + 128; # no RGB as all same, dummy at + end my $yTrans = $P / 360; my $maxRad = $M * $S / 2; my $yOff = 100; my $turns = sequence(indx, $L / $P); for my $turn ( $turns->list ) { for my $p ( map $_/2, -$maxRad*2 .. $maxRad*2 ) { my $rot = atan2( sqrt( $maxRad**2 - $p**2 ), $p ); my $this_profile = $profilePDL + [0, $yTrans * DEGRAD * $rot, +0]; $this_profile *= [cos($rot), 1, cos( PI/2 - $rot)]; my ($newX, $newY) = ($xc + $this_profile->slice([X,0,0]) * $S, + $yOff + $this_profile->slice([Y,0,0]) * $S); my $color = $this_profile->slice([A,0,0])->abs * 256 + (100 - +$newY); $imPDL->indexND( $newX->cat($newY)->mv(-1,0) ) .= $color; } $yOff += $P * $S; } my $im = PDL::IO::GD->new({pdl=> $imPDL->dummy(2,3)}); $im->String( gdFontGetSmall, 0,0, $fname, 0 ); $im->write_Png($fname);
You can see I eliminated the "Z" enum and entry in the "profile" and "xformPoints" (inlined as a couple of PDL operations here). That made the original run about 10% faster. Top tip for programmers! If you don't want to waste 45 minutes cursing your existence in confusion, if you eliminate part of a fixed-size tuple, make sure you eliminate it in all places. Unexpected results will happen otherwise!

Replies are listed 'Best First'.
Re^2: Reflections on graphic screwing!
by etj (Priest) on Sep 09, 2024 at 00:16 UTC
    Here is a PDL version I like more:
    #! perl -sw use strict; use PDL; use PDL::Constants qw(PI DEGRAD); use PDL::IO::GD; use constant FNAME => 'CGScrewPDL'; use constant { COS30 => 0.86602540378443864676372317075294, TAN30 => 0.57735026918962576450914878050196, }; use enum qw[ X Y A ]; $|=1; ## Construct thread profile segment sub constructThreadProfile { my ($dia, $pitch, $yInc) = @_; my $H = $pitch * COS30; # draw flat crest my $x = $dia / 2; my @yLim = (0, $pitch / 16); my $yQuant = ($yLim[-1]-$yLim[-2]) / $yInc; my $ySeq = zeroes($yQuant)->xlinvals(0, $yLim[-1]); my $topcrest = cat(pdl($x), $ySeq, pdl(1))->mv(-1,0); ## draw upper 30deg flank. my $xd = $yInc / TAN30; push @yLim, $yLim[-1] + 5 / 16 * $pitch; $yQuant = ($yLim[-1]-$yLim[-2]) / $yInc; my $xSeq = zeroes($yQuant)->xlinvals($x, $x - $xd*$yQuant); $ySeq = zeroes($yQuant)->xlinvals($yLim[-2], $yLim[-1]); my $topflank = cat($xSeq, $ySeq, pdl(0.5))->mv(-1,0); ## draw root radius push @yLim, $yLim[-1] + 4 / 16 * $pitch; my ($cx, $cy, $r) = ( ( $dia/2 - 7/8*$H + $H/3 ), $pitch / 2, $H / + 6 ); $yQuant = ($yLim[-1]-$yLim[-2]) / $yInc; $ySeq = zeroes($yQuant)->xlinvals($yLim[-2], $yLim[-1]); my $dy = $cy - $ySeq; my $dx = sqrt( $r**2 - $dy**2 ); my $rootradius = cat($xSeq = $cx - $dx, $ySeq, $dx / $r)->mv(-1,0) +; $x = $xSeq->at(-1); ## draw lower 30deg flank push @yLim, $yLim[-1] + 5 / 16 * $pitch; $yQuant = ($yLim[-1]-$yLim[-2]) / $yInc; $xSeq = zeroes($yQuant)->xlinvals($x, $x + $xd*$yQuant); $ySeq = zeroes($yQuant)->xlinvals($yLim[-2], $yLim[-1]); my $botflank = cat($xSeq, $ySeq, pdl(-0.5))->mv(-1,0); $x = $xSeq->at(-1); ## the bottom crest push @yLim, $pitch; $yQuant = ($yLim[-1]-$yLim[-2]) / $yInc; $ySeq = zeroes($yQuant)->xlinvals($yLim[-2], $yLim[-1]); my $botcrest = cat(pdl($x), $ySeq, pdl(1))->mv(-1,0); $topcrest->glue(1, $topflank, $rootradius, $botflank, $botcrest); } our $M //= 10; our $P //= 1.5; our $L //= 2; our $S //= 100; my $profilePDL = constructThreadProfile( $M, $P, 1 / ( 10* $S ) ); my ($w, $h) = ( $M * $S + 200, int( $L / $P + 1 ) * $P * $S + 200 ); my $imPDL = zeroes(byte, $w, $h) + 128; # no RGB as all same, dummy at + end my $xc = $w / 2; my $yTrans = $P / 360; my $maxRad = $M * $S / 2; my $yOffs = 100 + sequence($L / $P) * $P * $S; for my $yOff ( $yOffs->list ) { print "*"; my $p = zeroes(4*$maxRad+1)->xlinvals(-$maxRad, $maxRad); # 4mR my $rot = atan2(sqrt($maxRad**2 - $p**2), $p)->dummy(0); # 1 4mR my $this_profile = $profilePDL # 3 pE + cat(pdl(0), $yTrans*DEGRAD*$rot, pdl(0))->mv(-1,0) # 3 1 4 +mR ; # 3 pE 4 +mR $this_profile *= cat(cos($rot), pdl(1), cos( PI/2 - $rot))->mv(-1, +0); my $newX = $xc + $this_profile->slice([X,0,0]) * $S; my $newY = $yOff + $this_profile->slice([Y,0,0]) * $S; my $color = $this_profile->slice([A,0,0])->abs * 256 + (100 - $new +Y); $imPDL->indexND( $newX->cat($newY)->mv(-1,0) ) .= $color; } print "\n"; my $fname = sprintf "%sM%.2fxP%.2fxL%.2fxS%d.png", FNAME, $M, $P, $L, +$S; my $im = PDL::IO::GD->new({pdl=> $imPDL->dummy(2,3)}); $im->String( gdFontGetSmall, 0,0, $fname, 0 ); $im->write_Png($fname);
    For the bigger screws it runs about the same speed on my machine, but the CPU is thrashing away at over 100%, as even with the smaller one, with 1500 profile rows and 2000 maxRad entries, that means >1e6 elements, and that's when PDL's auto-pthreading kicks in. That almost certainly means that on a less puny machine it will run rather quickly.

    I didn't broadcast over the yOffs, even though that's obviously possible, because it's already quite bulky in memory. Comments very welcome!

    The two main lessons I got from this were:

    • using pdl instead of cat is a mistake for putting bigger data together, because even though cat insists on all ndarrays as input, it will broadcast even little ones rather than just silently filling in missing stuff with $PDL::undefval (often 0)
    • the technique of commenting the dimensions of initially-incompatible ndarrays makes it very very easy to figure out where to put dummy dimensions or move them, which is otherwise very difficult