Re: Duff's Device

It's meaningless in Perl, except as an anecdote, because the virtual machine has so many branches at the machine code level no matter what code you write, and no additional expensive penalty for jumping around in Perl code.

It's not just the jumping you save on, it's also the reduced testing. Duff's device does two things: it does loop unrolling, and it uses a neat/tricky way of dealing with the border case of a loop unrolling technique. Loop unrolling pays because of not having to evaluate a condition. Here's a benchmark showing duff's device to be significant faster than a plain loop (yes, the work done is contrived, but that's not the point):

#!/usr/bin/perl

use strict;
use warnings;

use Benchmark qw /timethese cmpthese/;

our ($x, $y);
our  $N = shift || 100;

cmpthese -1  => {
    noduff   => '$x = 0;
                 my $n = $N;
                 while ($n --) {$x ++}',
    duff     => '$y = 0;
                 my $n = int (($N + 7) / 8);
                 goto "LABEL" . ($N % 8);
                 {   LABEL0: $y ++;
                     LABEL7: $y ++;
                     LABEL6: $y ++;
                     LABEL5: $y ++;
                     LABEL4: $y ++;
                     LABEL3: $y ++;
                     LABEL2: $y ++;
                     LABEL1: $y ++;
                     redo if -- $n > 0
                  }',
};

die "Unequal \$N == $N; \$x == $x; \$y == $y.\n" unless $N == $x && $N
+ == $y;

__END__
          Rate noduff   duff
noduff 39384/s     --   -42%
duff   68267/s    73%     --
[download]

Abigail

Comment on Re: Duff's Device Download Code

Replies are listed 'Best First'.
Re: Re: Duff's Device by halley (Prior) on Feb 24, 2004 at 14:32 UTC
Right. The "edge case" is what fraction of the larger loop needs to be run, to emulate a smaller loop. As an analogy, look at running tracks. One track is a loop that is 0.25 miles long. Another track is 1.00 miles in a straight line. If you want to run 1.00 miles, this is really easy to figure out where you should start and stop on either track. If you want to run 1.25 miles, it's easy on the loop track, but you have to figure out where to start or stop on the line track. In CPU terms, leaving the track at an odd place requires more thought, more often, than entering the track at an odd place. So you calculate the fractional component once and jump into the race in the middle. -- `[ e d @ h a l l e y . c c ]`	[reply]
Re: Duff's Device by Abigail-II (Bishop) on Feb 24, 2004 at 14:43 UTC
I fail to understand what you are saying. Any speed gains of Duff's device aren't regarding the calculation of fractional components, or where to enter or leave the track at the odd place. That's only done once. It's more like running tracks where each time you pass the finish line you have to stop, and decrease the lap counter. Running 1 mile on the smaller track means having to stop 4 times to decrease the counter, while running on the longer track means stopping once. That's loop unrolling. The trick of Duff's device is doing that instead of running on big track for N miles, and a couple of runs on the small track to make up for the fractional distance, you start somewhere on the big track. It's the loop unrolling that saves time, it's Duff's trick of mixing switch/while that saves on tracks. Or rather, it saves on (duplicate) code. Abigail	[reply]
Re: Re: Duff's Device by demerphq (Chancellor) on Feb 24, 2004 at 18:13 UTC
So the important point here is that we only have to evaluate `if (--$n > 0)` one eighth of the time? --- demerphq _{First they ignore you, then they laugh at you, then they fight you, then you win. -- Gandhi}	[reply] [d/l] [select]
Re: Duff's Device by Abigail-II (Bishop) on Feb 24, 2004 at 19:43 UTC
So the important point here is that we only have to evaluate if (--$n > 0) one eighth of the time? To quote myself: Duff's device does two things: it does loop unrolling, and it uses a neat/tricky way of dealing with the border case of a loop unrolling technique. So, I'd say it's an important thing. Abigail	[reply]