Using a join is much faster.

update (1). Provided that the arrays are both sorted and contiguous. Thanks robartes for pointing it out.

update (2). A modified routine, also using join solves both the above problems. (see below)

#!/usr/bin/perl -w use strict; use Benchmark qw(cmpthese timethese); my $num_elem = 10_000; my $count = 500; my @elements = map {$_} (1..$num_elem), (qw(10001 10002 10003 10004 10 +005)); my @search = qw( 10001 10002 10003 ); # ------------------------------------------ sub match_arrays{ # gmax my ($elem, $find) = @_; my $str_find = join "/", @$find; return join('/',@$elem) =~ /$str_find/; } #------------------------------------------- #print match_arrays(\@elements, \@search),"\n"; sub contains { # robartes my $array=shift; my $container=shift; my %container=map { $_ => undef } @$container; foreach (@$array) { return 0 unless (exists $container{$_}); } return 1; } sub in_ordered_set { # demerphq my ($s,$a)=@_; return 1 unless @$s; return 0 unless @$a; my ($i,$j)=(0,0); while ($i<@$s) { while ($j<@$a) { last if $a->[$j] == $s->[$i]; return 0 if $a->[$j]>$s->[$i]; $j++; } $j++; $i++; } return 1 } my $result = timethese($count, { 'join' => sub {match_arrays(\@elements, \@search)}, 'hash' => sub {contains (\@search, \@elements)}, 'set' => sub {in_ordered_set (\@search, \@elements)} }); cmpthese ($result); __END__ Array elements: 1000 Benchmark: timing 2000 iterations of hash, join, set... hash: 4 wallclock secs ( 4.34 usr + 0.02 sys = 4.36 CPU) join: 0 wallclock secs ( 0.47 usr + 0.00 sys = 0.47 CPU) set: 4 wallclock secs ( 3.64 usr + 0.00 sys = 3.64 CPU) Rate hash set join hash 459/s -- -17% -89% set 549/s 20% -- -87% join 4255/s 828% 674% -- Array elements: 10000 Benchmark: timing 500 iterations of hash, join, set... hash: 15 wallclock secs (14.72 usr + 0.06 sys = 14.78 CPU) join: 1 wallclock secs ( 1.26 usr + 0.01 sys = 1.27 CPU) set: 10 wallclock secs ( 9.04 usr + 0.02 sys = 9.06 CPU) Rate hash set join hash 33.8/s -- -39% -91% set 55.2/s 63% -- -86% join 394/s 1064% 613% --

update (2). This modified sub can also find a match when the arrays are not sorted, with almost the same efficiency.

sub match_arrays2{ # gmax my ($elem, $find) = @_; my @regex_find = map {qr /#$_#/} @$find; my @match = (); my $elem_join = '#' .join('#',@$elem) . "#"; for (@regex_find) { push @match, undef if $elem_join =~ /$_/ ; } return @match == @$find; } __END__ Benchmark: timing 500 iterations of hash, join, join2, set... hash: 15 wallclock secs (14.66 usr + 0.06 sys = 14.72 CPU) join: 1 wallclock secs ( 1.27 usr + 0.00 sys = 1.27 CPU) join2: 2 wallclock secs ( 1.27 usr + 0.00 sys = 1.27 CPU) set: 10 wallclock secs ( 9.89 usr + 0.04 sys = 9.93 CPU) Rate hash set join join2 hash 34.0/s -- -33% -91% -91% set 50.4/s 48% -- -87% -87% join 394/s 1059% 682% -- 0% join2 394/s 1059% 682% 0% --
_ _ _ _ (_|| | |(_|>< _|

In reply to Re: Matching arrays by gmax
in thread Matching arrays by arrow

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.