Sometimes the most naive approach is the most rewarding. In terms of speed, the FAQ approach, i.e. iterating through the array items and returning at the first one that doesn't match, seems to be the fastest solution.
grep doesn't fare well here, because it keeps iterating over the whole list even if the difference is in the first item.
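
To make the difference concrete, here is a minimal sketch that counts how many element comparisons each style performs on the first test data set. The helper names count_grep and count_loop are hypothetical (they are not part of the benchmarked code), and they only count comparisons; they do not measure time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers, for illustration only: each returns the number
# of element comparisons made before the verdict is known.

sub count_grep {
    my ($first, $second) = @_;
    my $count = 0;
    # grep visits every index, even after a mismatch has been found
    my @diff = grep { $count++; $first->[$_] ne $second->[$_] } 0 .. $#$first;
    return $count;
}

sub count_loop {
    my ($first, $second) = @_;
    my $count = 0;
    for my $i (0 .. $#$first) {
        $count++;
        last if $first->[$i] ne $second->[$i];   # stop at the first mismatch
    }
    return $count;
}

# Same data as the benchmark: the arrays differ in the very first item
my @first  = (qw(a b c d e f), 1 .. 200);
my @second = (qw(b a c d e f), 1 .. 200);

printf "grep: %d comparisons\n", count_grep(\@first, \@second);
printf "loop: %d comparisons\n", count_loop(\@first, \@second);
```

With this data the grep version makes 206 comparisons and the loop makes 1, which is the gap the benchmark measures in wallclock terms.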

#!/usr/bin/perl -w
use strict;
use Data::Compare;
use Benchmark;

sub aeq { # Zaxo
    my ($first, $second, @comp) = @_;
    return 0 if @{$first} != @{$second};
    @comp = grep {$first->[$_] ne $second->[$_]} 0..$#$first;
    return not @comp;
}

sub compare_arrays { # from FAQ
    my ($first, $second) = @_;
    return 0 unless @$first == @$second;
    for (my $i = 0; $i < @$first; $i++) {
        return 0 if $first->[$i] ne $second->[$i];
    }
    return 1;
}

my @first = (qw(a b c d e f), (1..200)) ;
my @second= (qw(b a c d e f), (1..200)) ;

timethese (20_000,
        {
         'grep' => sub {my $diff = aeq(\@first,\@second)},
         'faq' => sub {my $diff = compare_arrays(\@first,\@second)},
         'DComp' =>sub {my $diff = Compare(\@first,\@second)}
        });

__END__
Benchmark: timing 20000 iterations of DComp, faq, grep...
DComp:  2 wallclock secs ( 1.42 usr + 0.00 sys =  1.42 CPU)
 grep: 13 wallclock secs (13.13 usr + 0.00 sys = 13.13 CPU)
  faq:  1 wallclock secs ( 0.55 usr + 0.00 sys =  0.55 CPU)

update
That was the worst case for grep. In its best case, i.e. when the difference is at the end of the array, grep is the fastest approach, as you would expect, though only narrowly: 1.35 vs. 1.48 CPU seconds for the FAQ loop in the run below (2000 iterations this time, not 20000).
As a personal choice, I would still use the FAQ approach, since it gives acceptable results in both extreme cases.

my @first = ((1..200), qw(a b c d e f)) ;
my @second= ((1..200), qw(b a c d e f)) ;

Benchmark: timing 2000 iterations of DComp, faq, grep...
DComp: 12 wallclock secs (11.52 usr + 0.00 sys = 11.52 CPU)
  faq:  2 wallclock secs ( 1.48 usr + 0.00 sys =  1.48 CPU)
 grep:  1 wallclock secs ( 1.35 usr + 0.00 sys =  1.35 CPU)

update (2)
MeowChow's improvement over the FAQ's algorithm has the best performance in both extreme cases! Good shot!

 _  _ _  _  
(_|| | |(_|><
 _|

In reply to Re: (Efficiently) comparing arrays by gmax
in thread comparing arrays by cyberconte
