Re: On optimizing nested loops

Your first code sample might run faster if you rewrote it this way:

  OUTER: foreach my $record (@in) {
    foreach my $field (keys %{$where}) {
      next OUTER unless $record->{$field} eq $where->{$field};
    }
    push @out, $record;
  }

But I cannot be sure, and I cannot test it without your data.
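For anyone who wants to try the labeled-loop version without the original data, here is a minimal self-contained sketch. The records in @in and the criteria in $where are made up for illustration; the real structures are not shown in this thread, so treat the field names as assumptions.

```perl
use strict;
use warnings;

# Hypothetical sample data: the real @in and $where are not shown in the thread.
my @in = (
    { name => 'alice', city => 'Paris',  lang => 'perl' },
    { name => 'bob',   city => 'London', lang => 'perl' },
    { name => 'carol', city => 'Paris',  lang => 'c'    },
);
my $where = { city => 'Paris', lang => 'perl' };

my @out;
OUTER: foreach my $record (@in) {
    foreach my $field (keys %{$where}) {
        # Jump to the next record as soon as one field fails to match.
        next OUTER unless $record->{$field} eq $where->{$field};
    }
    push @out, $record;    # reached only if every field matched
}

print join(',', map { $_->{name} } @out), "\n";
```

The point of the label is that no $keep flag is needed: a record that fails any test never reaches the push.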

Update: I just compared the two approaches on a far simpler data set, so the results may not carry over to your data.

With your initial algorithm:

$ time perl -e '  my @out = ();

  for my $record (0..10000000) {
    my $keep = 1;
    foreach my $field (qw /80000/) {
      unless ($record == $field) {
        $keep = 0;
        last;
      }
      push @out, $record if $keep;
    }
  }

  print "@out", "\n";'
80000

real    0m4.260s
user    0m4.227s
sys     0m0.015s

With the changes I suggested above:

$ time perl -e '  my @out = ();

  OUTER: for my $record (0..10000000) {
    foreach my $field (80000) {
      next OUTER unless ($record == $field);
    }
    push @out, $record ;
  }

  print "@out", "\n";'
80000

real    0m2.761s
user    0m2.745s
sys     0m0.000s

So the suggested change appears to bring an improvement: about 2.8 s instead of 4.3 s on this test.

But of course, the inner loop is useless in this case, since the inner list has only one item. Removing the loop and comparing directly gives an idea of what it costs to set up that loop ten million times:

$ time perl -e '  my @out = ();

  OUTER: for my $record (0..10000000) {
      next OUTER unless ($record == 80000);
      push @out, $record ;
  }

  print "@out", "\n";'
80000

real    0m0.854s
user    0m0.841s
sys     0m0.000s

This seems to confirm what you were saying about the cost of building a foreach loop a very large number of times.
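
If you want to repeat the comparison without relying on the shell's time, the core Benchmark module can run the three variants side by side. This is only a sketch under my own assumptions: the sub names, the smaller range ($max) and the target value are arbitrary choices, and in the flag version the push sits after the inner loop, which is where I assume it was meant to be.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $max    = 100_000;    # arbitrary, far smaller than the ten million used above
my $target = 80_000;

# Flag-based version; the push is placed after the inner loop here.
sub with_flag {
    my @out;
    for my $record (0 .. $max) {
        my $keep = 1;
        foreach my $field ($target) {
            unless ($record == $field) {
                $keep = 0;
                last;
            }
        }
        push @out, $record if $keep;
    }
    return \@out;
}

# Labeled-loop version, as suggested above.
sub with_next_outer {
    my @out;
    OUTER: for my $record (0 .. $max) {
        foreach my $field ($target) {
            next OUTER unless $record == $field;
        }
        push @out, $record;
    }
    return \@out;
}

# No inner loop at all: a direct comparison.
sub no_inner_loop {
    my @out;
    for my $record (0 .. $max) {
        next unless $record == $target;
        push @out, $record;
    }
    return \@out;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    flag       => \&with_flag,
    next_outer => \&with_next_outer,
    no_inner   => \&no_inner_loop,
});
```

The absolute numbers will depend on the machine and the Perl version, so only the relative ordering is worth reading from the table.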
