Re: Fastest way to test membership in a constant set
by BrowserUk (Patriarch) on Jul 26, 2007 at 14:40 UTC
|
Use a restricted hash and use exists directly, rather than via a subroutine. This will be far faster than any other method.
#! perl -slw
use strict;
use Hash::Util qw[ lock_hash ];
my %set = map{ $_ => 1 } qw[ age old years ];
lock_hash %set;
for my $unknown ( qw[ age sex old young years minutes] ) {
print "$unknown ",
exists $set{ $unknown }
? 'exists'
: ' does not exist';
}
$set{ fred } = 1;
__END__
c:\test>junk3
age exists
sex does not exist
old exists
young does not exist
years exists
minutes does not exist
Attempt to access disallowed key 'fred' in a restricted hash at c:\tes
+t\junk3.pl line 15.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
Re: Fastest way to test membership in a constant set
by Corion (Patriarch) on Jul 26, 2007 at 14:31 UTC
|
You don't show us how using a global hash fails for you, but here is an approach:
use strict;
use vars qw(%IsMember);
%IsMember = ('age' => 1, 'old' => 1, 'years' => 1);
Use it as:
if ($IsMember{age}) { print "'age' is a member\n" };
if ($IsMember{ego}) { print "'ego' is a member\n" };
For development, you can lock the hash keys by using Tie::Hash::FixedKeys. | [reply] [d/l] [select] |
|
|
Why "use vars qw(%IsMember)" instead of my %IsMember or our %IsMember?
| [reply] |
|
|
| [reply] [d/l] [select] |
Re: Fastest way to test membership in a constant set (don't use a hash)
by grinder (Bishop) on Jul 26, 2007 at 22:24 UTC
|
The set is small? And constant? Forget the hash, you have fallen into an XY trap.
If they contain only alphabetical characters (or some subset) then it is a trivial matter to come up with a character that is unused by any of the set members.
Build a string by joining the values using the delimiter character, and bracket the beginning and end of the string. You can then use index to determine membership. If the set is small enough, this will be faster than any hash or regexp solution.
my $delim = ",";
my $set = $delim . join($delim, keys %set) . $delim;
sub IsMember {
return index($set, "$delim$_[0]$delim") != -1;
}
• another intruder with the mooring in the heart of the Perl
| [reply] [d/l] |
|
|
#! perl -slw
use strict;
use Hash::Util qw[ lock_hash unlock_hash ];
use Benchmark qw[ cmpthese ];
our $ITERS ||= -1;
our @items = qw[ age old years ];
our @tests = ( @items, map{ $_ . 'x' } @items );
our %set = map{ $_ => 1 } @items; lock_hash %set;
our $set = $; . join( $;, @items ) . $;;
cmpthese $ITERS, {
hash => q[
my $count;
exists $set{ $_ } and ++$count for @tests;
print "hash: $count" if $ITERS == 1;
],
index => q[
my $count;
1+index $set, $; . $_ . $; and ++$count for @tests;
print "index: $count" if $ITERS == 1;
],
};
@items = map{ $_ x 5 } 'a' .. 'z';
@tests = ( @items, map{ $_ . 'x' } @items );
unlock_hash %set;
%set = map{ $_ => 1 } @items; lock_hash %set;
$set = $; . join( $;, @items ) . $;;
cmpthese $ITERS, {
hash => q[
my $count;
exists $set{ $_ } and ++$count for @tests;
print "hash: $count" if $ITERS == 1;
],
index => q[
my $count;
1+index $set, $; . $_ . $; and ++$count for @tests;
print "index: $count" if $ITERS == 1;
],
};
__END__
## Check they produce teh same results:
C:\test>junk3 -ITERS=1
hash: 3
(warning: too few iterations for a reliable count)
index: 3
(warning: too few iterations for a reliable count)
Rate hash index
hash 1000000000000000/s -- 0%
index 1000000000000000/s 0% --
hash: 26
(warning: too few iterations for a reliable count)
index: 26
(warning: too few iterations for a reliable count)
Rate hash index
hash 1000000000000000/s -- 0%
index 1000000000000000/s 0% --
## See how fast
C:\test>junk3
Rate index hash
index 166820/s -- -31%
hash 240174/s 44% --
Rate index hash
index 18301/s -- -59%
hash 44377/s 142% --
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
|
|
I had to try your tests with bitmaps. Not as elegant as hashes
but a bit faster.
#! /usr/bin/perl -slw
use strict;
use Hash::Util qw[ lock_hash unlock_hash ];
use Benchmark qw[ cmpthese ];
our $ITERS ||= -1;
our @items = qw[ age old years ];
our @tests = ( @items, map{ $_ . 'x' } @items );
our $a=0b00000000000000000000000001;
our $b=0b00000000000000000000000010;
our $c=0b00000000000000000000000100;
our $d=0b00000000000000000000001000;
our $e=0b00000000000000000000010000;
our $f=0b00000000000000000000100000;
our $g=0b00000000000000000001000000;
our $h=0b00000000000000000010000000;
our $i=0b00000000000000000100000000;
our $j=0b00000000000000001000000000;
our $k=0b00000000000000010000000000;
our $l=0b00000000000000100000000000;
our $m=0b00000000000001000000000000;
our $n=0b00000000000010000000000000;
our $o=0b00000000000100000000000000;
our $p=0b00000000001000000000000000;
our $q=0b00000000010000000000000000;
our $r=0b00000000100000000000000000;
our $s=0b00000001000000000000000000;
our $t=0b00000010000000000000000000;
our $u=0b00000100000000000000000000;
our $v=0b00001000000000000000000000;
our $w=0b00010000000000000000000000;
our $x=0b00100000000000000000000000;
our $y=0b01000000000000000000000000;
our $z=0b10000000000000000000000000;
our @myitems= (
$a, $b, $c,
);
our @mytest= ( @myitems, $d, $e, $f);
our $myset= $a | $b | $c;
our %set = map{ $_ => 1 } @items; lock_hash %set;
our $set = $; . join( $;, @items ) . $;;
cmpthese $ITERS, {
hash => q[
my $count;
exists $set{ $_ } and ++$count for @tests;
print "hash: $count" if $ITERS == 1;
],
index => q[
my $count;
1+index $set, $; . $_ . $; and ++$count for @tests;
print "index: $count" if $ITERS == 1;
],
bitmap => q[
my $count;
$myset & $_ and ++$count for @mytest;
print "bitmap: $count" if $ITERS == 1;
],
};
@items = map{ $_ x 5 } 'a' .. 'z';
@tests = ( @items, map{ $_ . 'x' } @items );
unlock_hash %set;
%set = map{ $_ => 1 } @items; lock_hash %set;
$set = $; . join( $;, @items ) . $;;
@myitems= (
$a, $b, $c, $d, $e, $f, $g, $h, $i, $j,
$k, $l, $m, $n, $o, $p, $q, $r, $s, $t,
$u, $v, $w, $x, $y, $z,
);
@mytest= ( @myitems, map { 0b0 } @myitems);
$myset= 0b11111111111111111111111111;
cmpthese $ITERS, {
hash => q[
my $count;
exists $set{ $_ } and ++$count for @tests;
print "hash: $count" if $ITERS == 1;
],
index => q[
my $count;
1+index $set, $; . $_ . $; and ++$count for @tests;
print "index: $count" if $ITERS == 1;
],
bitmap => q[
my $count;
$myset & $_ and ++$count for @mytest;
print "bitmap: $count" if $ITERS == 1;
],
};
Results on my machine:
Rate index hash bitmap
index 210823/s -- -51% -57%
hash 432785/s 105% -- -13%
bitmap 494700/s 135% 14% --
Rate index hash bitmap
index 20276/s -- -73% -81%
hash 75918/s 274% -- -28%
bitmap 105217/s 419% 39% --
| [reply] [d/l] [select] |
|
|
|
|
|
|
my $set = $delim . join($delim, keys %set) . $delim;
When I need to join and have the delimeter also at the beginning or ending or both, I use empty strings:
my $set = join( $delim, "", keys %set, "" );
Just another way to do it. | [reply] [d/l] [select] |
Re: Fastest way to test membership in a constant set
by runrig (Abbot) on Jul 26, 2007 at 16:53 UTC
|
You don't need a readonly hash if you just use a closure...no code outside the block can change the hash: BEGIN {
my %members;
undef @members{qw(age old years)};
sub is_member {
return exists $members{$_[0]};
}
}
| [reply] [d/l] |
Re: Fastest way to test membership in a constant set
by FunkyMonk (Bishop) on Jul 26, 2007 at 14:27 UTC
|
| [reply] |
Re: Fastest way to test membership in a constant set
by Anonymous Monk on Jul 26, 2007 at 15:24 UTC
|
To answer your questions: 1. I could use non-XS Readonly for non-production (or Tie::Hash::FixedKeys as well). The desire for constness comes from wanting to write code that is self-documenting and abuse-preventing. In our environment, hacks and cut-n-paste from production code can (and does) happen. I want to minimize the hacks that will eventually occur but at the same time maximize speed. Sadly, I don't think my version of perl supports Hash::Util.
| [reply] |
|
|
Sadly, I don't think my version of perl supports Hash::Util
You're using older than 5.8.0? Isn't that at least 5 years old?
I'm not sure, and a quick scan of perldelta didn't tell me, when Internals first appeared, but it can do the same job with a minor additional amount of work. Once you have built your hash, you have to set the values readonly as well as the hash itself. Hardly onerous though:
#! perl -slw
use strict;
use Internals qw[ SetReadOnly ];
my %set = map{ $_ => 1 } qw[ age old years ];
SetReadOnly( \$_ ) for values %set; ## Freeze values
SetReadOnly \%set; ## and keys.
for my $unknown ( qw[ age sex old young years minutes] ) {
print "$unknown ",
exists $set{ $unknown }
? 'exists'
: ' does not exist';
}
eval{ $set{ age } = 123 } or print $@;
eval{ $set{ fred } = 1 } or print $@;
__END__
C:\test>junk3
age exists
sex does not exist
old exists
young does not exist
years exists
minutes does not exist
Modification of a read-only value attempted at C:\test\junk3.pl line 1
+7.
Attempt to access disallowed key 'fred' in a restricted hash at C:\tes
+t\junk3.pl line 18.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] |
Re: Fastest way to test membership in a constant set
by Cabrion (Friar) on Jul 27, 2007 at 11:09 UTC
|
Bear with me because this is going to seem silly at first. What about a completely hard coded approach? From the original post, the goal seems to be speed in addtion to being constant. Why bother using hashes at all?
The following subroutine should be quite a bit faster than what you have, thought I admit I haven't compared the two.
sub IsMemeber {
if $_[0] eq 'age' return 1;
if $_[0] eq 'old' return 1;
if $_[0] eq 'years' return 1;
return 0;
}
You could probably speed up the code even more by in-lining the if statements instead of enduring the overhead of a function call.
| [reply] [d/l] |
|
|
This is fine if you have just a few keys. However, this approach is
O(n) (i.e. execution time increases linearly with the number of items
in the set), whereas a hash has approximately constant lookup time — or O(1).
Even for only 10 items, a hash is considerably faster:
#!/usr/bin/perl -w
my $nkeys = 10;
my @keys;
our $key; # (keep last key for lookup below)
for (1..$nkeys) {
# generate random keys consisting of 3..10 chars
$key = join "", map chr(97+int(rand(26))), (1..3+int(rand(8)));
push @keys, $key;
}
# setup 'if'-code
my $code = "sub isMember_if {\n"
. join("", map "return 1 if \$_[0] eq '$_';\n", @keys)
. "return 0;}";
#print "$code\n";
eval $code;
# setup hash
my %hash = map { $_ => 1 } @keys;
sub isMember_hash { return $hash{$_[0]} };
use Benchmark 'cmpthese';
cmpthese( -1, {
'if' => 'isMember_if($key)',
'hash' => 'isMember_hash($key)'
});
On my machine, this prints (results vary slightly, depending on random keys being used)
Rate if hash
if 628278/s -- -68%
hash 1946613/s 210% --
Update: Actually, Perl's hashes are blazingly fast:
even for just one key/comparison ($nkeys = 1), the
if-approach still doesn't outperform the hash; and with key lengths
of more than about 3 chars, the hash lookup is even noticeably faster
(which seems to indicate that comparing two identical strings is
slower than computing the hashed value of that string...). | [reply] [d/l] [select] |
|
|
Perl never ceases to impress me. Also, shame on me for not doing my own benchmarks. I did know the if-approach would not scale well, but the OP seemed to be looking for a solution to a problem who's "search domain" was relatively small (3 elements in the example). But you have brought to light something I would never have expected.
| [reply] |
Re: Fastest way to test membership in a constant set
by Anonymous Monk on Jul 26, 2007 at 14:25 UTC
|
Okay - it seems like the closure method is the fastest at this time. But, can a use constant be used to define a constant set in Perl 5.6.1?
| [reply] |
|
|
constant, example 3.
See the update here.
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] |
|
|
| [reply] |
|
|
|
|