in reply to Matching arrays
Using a join is much faster.
update (1). Provided that the arrays are both sorted and contiguous. Thanks robartes for pointing it out.
update (2). A modified routine, also using join solves both the above problems. (see below)
#!/usr/bin/perl -w use strict; use Benchmark qw(cmpthese timethese); my $num_elem = 10_000; my $count = 500; my @elements = map {$_} (1..$num_elem), (qw(10001 10002 10003 10004 10 +005)); my @search = qw( 10001 10002 10003 ); # ------------------------------------------ sub match_arrays{ # gmax my ($elem, $find) = @_; my $str_find = join "/", @$find; return join('/',@$elem) =~ /$str_find/; } #------------------------------------------- #print match_arrays(\@elements, \@search),"\n"; sub contains { # robartes my $array=shift; my $container=shift; my %container=map { $_ => undef } @$container; foreach (@$array) { return 0 unless (exists $container{$_}); } return 1; } sub in_ordered_set { # demerphq my ($s,$a)=@_; return 1 unless @$s; return 0 unless @$a; my ($i,$j)=(0,0); while ($i<@$s) { while ($j<@$a) { last if $a->[$j] == $s->[$i]; return 0 if $a->[$j]>$s->[$i]; $j++; } $j++; $i++; } return 1 } my $result = timethese($count, { 'join' => sub {match_arrays(\@elements, \@search)}, 'hash' => sub {contains (\@search, \@elements)}, 'set' => sub {in_ordered_set (\@search, \@elements)} }); cmpthese ($result); __END__ Array elements: 1000 Benchmark: timing 2000 iterations of hash, join, set... hash: 4 wallclock secs ( 4.34 usr + 0.02 sys = 4.36 CPU) join: 0 wallclock secs ( 0.47 usr + 0.00 sys = 0.47 CPU) set: 4 wallclock secs ( 3.64 usr + 0.00 sys = 3.64 CPU) Rate hash set join hash 459/s -- -17% -89% set 549/s 20% -- -87% join 4255/s 828% 674% -- Array elements: 10000 Benchmark: timing 500 iterations of hash, join, set... hash: 15 wallclock secs (14.72 usr + 0.06 sys = 14.78 CPU) join: 1 wallclock secs ( 1.26 usr + 0.01 sys = 1.27 CPU) set: 10 wallclock secs ( 9.04 usr + 0.02 sys = 9.06 CPU) Rate hash set join hash 33.8/s -- -39% -91% set 55.2/s 63% -- -86% join 394/s 1064% 613% --
update (2). This modified sub can also find a match when the arrays are not sorted, with almost the same efficiency.
sub match_arrays2{ # gmax my ($elem, $find) = @_; my @regex_find = map {qr /#$_#/} @$find; my @match = (); my $elem_join = '#' .join('#',@$elem) . "#"; for (@regex_find) { push @match, undef if $elem_join =~ /$_/ ; } return @match == @$find; } __END__ Benchmark: timing 500 iterations of hash, join, join2, set... hash: 15 wallclock secs (14.66 usr + 0.06 sys = 14.72 CPU) join: 1 wallclock secs ( 1.27 usr + 0.00 sys = 1.27 CPU) join2: 2 wallclock secs ( 1.27 usr + 0.00 sys = 1.27 CPU) set: 10 wallclock secs ( 9.89 usr + 0.04 sys = 9.93 CPU) Rate hash set join join2 hash 34.0/s -- -33% -91% -91% set 50.4/s 48% -- -87% -87% join 394/s 1059% 682% -- 0% join2 394/s 1059% 682% 0% --
_ _ _ _ (_|| | |(_|>< _|
|
|---|