Re: (Efficiently) comparing arrays

Sometimes, the most naive approach is the best rewarding. Talking about speed, the FAQ approach, i.e. just iterating through the array items and returning at the first that doesn't match seems to be the fastest solution.
grep doesn't look good, because it will continue iterating even if the difference is in the first item.

#!/usr/bin/perl -w
use strict;
use Data::Compare;
use Benchmark;

sub aeq { # Zaxo
    my ($first, $second, @comp) = @_;
    return 0 if @{$first} != @{$second};
    @comp = grep {$first->[$_] ne $second->[$_]} 0..$#$first;
    return not @comp;
}

sub compare_arrays { # from FAQ
    my ($first, $second) = @_;
    return 0 unless @$first == @$second;
    for (my $i = 0; $i < @$first; $i++) {
        return 0 if $first->[$i] ne $second->[$i];
    }
    return 1;
}

my @first = (qw(a b c d e f), (1..200)) ;
my @second= (qw(b a c d e f), (1..200)) ;

timethese (20_000,
        {
         'grep' => sub {my $diff = aeq(\@first,\@second)},
         'faq' => sub {my $diff = compare_arrays(\@first,\@second)},
         'DComp' =>sub {my $diff = Compare(\@first,\@second)}
        });

__END__
Benchmark: timing 20000 iterations of DComp, grep, naif...
DComp:  2 wallclock secs ( 1.42 usr + 0.00 sys =  1.42 CPU)
 grep: 13 wallclock secs (13.13 usr + 0.00 sys = 13.13 CPU)
  faq:  1 wallclock secs ( 0.55 usr + 0.00 sys =  0.55 CPU)
[download]

update
That was the worst case for grep. However, in the best case, i.e. when the difference is at the end of the array, grep is the fastest approach, as you would have expected.
As a personal choice, I would use the FAQ approach, since it can guarantee acceptable results in both extreme cases.

my @first = ((1..200), qw(a b c d e f)) ;
my @second= ((1..200), qw(b a c d e f)) ;

Benchmark: timing 2000 iterations of DComp, faq, grep...
DComp: 12 wallclock secs (11.52 usr + 0.00 sys = 11.52 CPU)
  faq:  2 wallclock secs ( 1.48 usr + 0.00 sys =  1.48 CPU)
 grep:  1 wallclock secs ( 1.35 usr + 0.00 sys =  1.35 CPU)
[download]

update (2)
MeowChow's improvement over the FAQ's algorithm has the best performance in both extreme cases! Good shot!

 _  _ _  _  
(_|| | |(_|><
 _|

Comment on Re: (Efficiently) comparing arrays Select or Download Code

Replies are listed 'Best First'.

(MeowChow) Re2: (Efficiently) comparing arrays
by MeowChow (Vicar) on Apr 04, 2002 at 10:15 UTC

    my ($first, $second) = @_;
    return 0 if @$first != @$second;
    my $i = 0;
    $second->[$i++] ne $_ && return 0 for @$first;
    return 1;
[download]

   MeowChow                                   
               s aamecha.s a..a\u$&owag.print

[reply]
[d/l]

Re: Re: (Efficiently) comparing arrays
by cyberconte (Scribe) on Apr 04, 2002 at 10:03 UTC

sub maptest {
    my ($low, $high) = @_;
    my $i=0;

    return 0 if @$low != @$high;
    map{return 0 if $_ ne $low->[$i++]}@$high;
    return 1;
}
__END__

with 
my @first = ((1..200), qw(a b c d e f)) ;
my @second= ((1..200), qw(b a c d e f)) ;

Benchmark: timing 20000 iterations of faq, map...
       faq:  8 wallclock secs ( 8.44 usr +  0.00 sys =  8.44 CPU)
       map: 10 wallclock secs ( 9.64 usr +  0.00 sys =  9.64 CPU)
[download]

[reply]
[d/l]

(MeowChow) Re3: (Efficiently) comparing arrays

by MeowChow (Vicar) on Apr 04, 2002 at 10:19 UTC

map in void context

update: demerphq is also correct, removing the lexical context indeed contributes some of the performance increase. ~~Since this is a one-time performance penalty, its effect will be greater relative to smaller arrays.~~

   MeowChow                                   
               s aamecha.s a..a\u$&owag.print

[reply]

Re: (MeowChow) Re3: (Efficiently) comparing arrays

by demerphq (Chancellor) on Apr 04, 2002 at 14:09 UTC

just my $0.02

Yves / DeMerphq
---
Writing a good benchmark isnt as easy as it might look.

[reply]