in reply to How to speed up a nested loop?

I can't imagine the following refactoring will make much difference, but it may help a little:

my $bxScaled = $bx * 16; my $byScaled = $by * 16; for my $x (0 .. 15) { my $xScaled = 16 * $x; for my $y (0 .. 15) { # this calculates the tile index we are currently at, # using the x and y coords in this block my $tile_index = $y + $xScaled; # this calculates the real x and y values of this tile my $rx = $bxScaled + $x; my $ry = $byScaled + $y; # insert type data into the internal map array and # note that there was change in this block if applicable my $tile = $tiles[$rx][$ry][$bz][type]; if (!defined $tile || $$tile != $type_data[$tile_index]) { $changed = 1; $$tile = $type_data[$tile_index]; } } }

Main changes are to reduce the number of multiplies performed in the inner loop and to reduce the number of array indexing operations.

Update: A quick benchmark indicates about a 40% speed improvement. I'm surprised!

use strict; use warnings; use Benchmark qw(cmpthese); use constant type => 0; cmpthese (-1, { org => 'org ()', gf => 'gf ()', }); sub org { my $bz = 0; my $bx = 0; my $by = 0; my $changed; my @tiles; my @type_data; for my $y (0 .. 15) { for my $x (0 .. 15) { # this calculates the tile index we are currently at, # using the x and y coords in this block my $tile_index = $y + ($x * 16); # this calculates the real x and y values of this tile my $rx = ($bx * 16) + $x; my $ry = ($by * 16) + $y; # insert type data into the internal map array and # note that there was change in this block if applicable if (!defined $tiles[$rx][$ry][$bz][type] || $tiles[$rx][$ry][$bz][type] != $type_data[$tile_ind +ex]) { $changed = 1; $tiles[$rx][$ry][$bz][type] = $type_data[$tile_index]; } } } } sub gf { my $bz = 0; my $bx = 0; my $by = 0; my $changed; my @tiles; my @type_data; my $bxScaled = $bx * 16; my $byScaled = $by * 16; for my $x (0 .. 15) { my $xScaled = 16 * $x; for my $y (0 .. 15) { # this calculates the tile index we are currently at, # using the x and y coords in this block my $tile_index = $y + $xScaled; # this calculates the real x and y values of this tile my $rx = $bxScaled + $x; my $ry = $byScaled + $y; # insert type data into the internal map array and # note that there was change in this block if applicable my $tile = $tiles[$rx][$ry][$bz][type]; if (!defined $tile || $$tile != $type_data[$tile_index]) { $changed = 1; $$tile = $type_data[$tile_index]; } } } } __END__ Rate org gf org 1372/s -- -29% gf 1939/s 41% --

Update: a further small improvement (about 10%) is realized by moving the $rx calculation to the outer loop.

Adding my $xTiles = $tiles[$rx] to the outer loop and using my $xTiles in place of $tiles[$rx] in the inner loop gives another 2% improvement (although that's getting to be in the benchmark noise and may be bogus).


Perl reduces RSI - it saves typing

Replies are listed 'Best First'.
Re^2: How to speed up a nested loop?
by tilly (Archbishop) on Sep 17, 2008 at 23:04 UTC
    You can factor more work out of the inner loop, but I would expect the win to be very minor:
    my $bxScaled = $bx * 16; my $byScaled = $by * 16; for my $x (0 .. 15) { my $xScaled = 16 * $x; # These calculations were factored out of the inner loop my $rx = $bxScaled + $x; my $tiles_for_rx = $tiles[$rx]; for my $y (0 .. 15) { # this calculates the tile index we are currently at, # using the x and y coords in this block my $tile_index = $y + $xScaled; # this calculates the real y value of this tile my $ry = $byScaled + $y; # insert type data into the internal map array and # note that there was change in this block if applicable my $tile = $tiles_for_rx->[$ry][$bz][type]; if (!defined $tile || $$tile != $type_data[$tile_index]) { $changed = 1; $$tile = $type_data[$tile_index]; } } }
    Incidentally I'm puzzled at the array access of type. Should that be a $type? Or perhaps a hash access? Either way the current version looks like a bug.

    Incidentally another cause of your win over the original is that you're using lexical variables consistently, which I believe are slightly faster than package variables.

      I assumed type was a constant. It may be a transcription error of some sort. I can't see it making any difference to the results of refactoring however.

      Don't we always use lexical variables and strictures? ;) The OP may be interested in reading The strictures, according to Seuss.


      Perl reduces RSI - it saves typing
        I am using warnings and strict, as the link to the .pl file in the op shows. Explanation as to what's going on with that though is below. :)
      Your hint has some truth. I should probably refactor my tiles array so i can get a reference to $tiles$bz$rx and go on from there. However you should probably read up about references a bit, as your code would never work as it is there. :) As for type, that is indeed a constant. Hashes are too slow, thus i used constants to build pseudo-hashes.
        What are you talking about? If Grandfather's code works, then my code should work as well because it is just a trivial refactoring. If it does not work, then I guarantee that it is the result of a silly typo and not a result of my not understanding how to use references in Perl. If I've done something you don't understand, the odds are higher that I understand something about references in Perl that you don't, than the reverse.

        If you doubt my competence, I suggest that you browse 10 or 20 of my best nodes. (You might want to skip the first couple, at least half of the top 10 don't have real programming meat.)

        On using type as a constant, the usual convention is that constants should be ALL_CAPS. It doesn't matter to Perl, but you are violating expectations there.

Re^2: How to speed up a nested loop?
by juster (Friar) on Sep 18, 2008 at 05:47 UTC

    I played with this for entirely too long and got about a 8% improvement on tilly's according to Benchmark (on linux with an old Pentium M). This was mostly by changing $tile_index to a counter instead of finding it by addition each time. I also made a lookup table for bx/byScaled but this did little.

    While doing this I got to wondering about the code snippet at the end:

    my $tile = $tiles[$rx][$ry][$bz][type]; if (!defined $tile || $$tile != $type_data[$tile_index]) { $changed = 1; $$tile = $type_data[$tile_index]; }

    which I don't understand. If $tile is undefined where does $$tile go to? If $$tile goes nowhere how come there's no warning or anything?

    I combined everyone's code together in one large benchmark:

    Thanks for the introduction to Benchmark and to the OP for an interesting node.

    Update: after looking at the code I posted I may have screwed up the order of comparisons and I'm too tired to fix it... but the point was... using an iterator.

    You could even have the x,y,z stored linearly in an array which would reduce the loops and have the same results. Then you could have an array of arrays (because you need a sub-array for the type element right?) and just iterate over with one loop instead of two.

    IE a matrix like this:

    (1 2 3) (4 5 6) ===> ( 1 2 3 4 5 6 7 8 9 ) (7 8 9) how to do in three dimensions, I dunno...
      Tilly's code is missing a few \s and ${}s to be able to work.

      Also, thank you, the counter for the tile actually got it down to .046 over the .048 from GF. :)
      As for storing it linearly ... Honestly i don't think it's worth the effort. This stuff is already difficult enough to understand and keep in my mind as it is and i'm not *that* desperate for performance. ^^

      If you're curious, here's a flow diagram.

      *Update*: These things actually reduced performance:
      my $rx = $bxScaled[$x]; my $ry = $byScaled[$y];
      I guess the simple additions there are faster than array lookups.
Re^2: How to speed up a nested loop?
by Xenofur (Monk) on Sep 18, 2008 at 08:10 UTC
    With your changes i'm at the following code and have a real life application improvement from 0.055 seconds for a whole "syncToDF" loop down to 0.049, which is already pretty respectable. Additionally you've helped show me where i need to keep my eyes open. Thanks a lot! :D
    my ($tile_index,$rx,$ry,$y,$x,$xScaled); my $bxScaled = $bx * 16; my $byScaled = $by * 16; for $x ( 0..15 ) { $rx = $bxScaled+$x; $xScaled = 16 * $x; for $y ( 0..15 ) { $tile_index = $y+$xScaled; $ry = $byScaled+$y; if ( !defined $tiles[$rx][$ry][$bz][type] || $tiles[$rx][$ry][$bz][type] != $type_data[$tile_index] ) { $changed = 1; $tiles[$rx][$ry][$bz][type] = $type_data[$tile_index]; } } }
    Now to see whether there's some more good hints. :)