use strict;
use warnings;
use feature 'say';
use B 'hash';
use Crypt::xxHash 'xxhash3_64bits';
use Digest::xxH64 'xx64';
use Benchmark 'cmpthese';
use Inline C => <<'_C_';
U32 myhash(SV* sv) {
    STRLEN len;
    U32 hash = 0;
    const char *s = SvPVbyte(sv, len);
    PERL_HASH(hash, s, len);   /* perl's internal hash function */
    return hash;
}
_C_
srand 1234;
my $s = pack 'C*', map rand 256, 1 .. 64;
cmpthese -2, {
    hash   => sub { my $x = hash( $s ) },
    myhash => sub { my $x = myhash( $s ) },
    xxhash => sub { my $x = xxhash3_64bits( $s, 0 ) },
    xx64   => sub { my $x = xx64( $s ) },
};
__END__
             Rate   hash myhash xxhash   xx64
hash    1944302/s     --   -52%   -54%   -84%
myhash  4088577/s   110%     --    -3%   -66%
xxhash  4233986/s   118%     4%     --   -65%
xx64   11994386/s   517%   193%   183%     --
This is perl 5, version 32, subversion 1 (v5.32.1) built for MSWin32-x64-multi-thread
Try xxHash? Digest::xxH64 is not on CPAN (but it is linked to from the project home page, i.e. officially 'endorsed'(?) :)), Crypt::xxHash needs a fix to install on Windows, and Digest::xxHash (not in the example above) is slower and therefore perhaps of little interest in the context of 'B::hash is too slow'.
As already mentioned, Judy::HS provides both hashing and sparse storage built in under the hood, so maybe hashing by hand is not what you need. I have 'played' (i.e. not in serious production use) with Judy (but not with Judy::HS) to store and access huge sparse data, and yes, the speed is comparable to Perl hashes, with a significantly smaller RAM appetite.
Another option to consider: Math::GSL::SparseMatrix (GSL being solid and renowned, etc.). As above, I 'played' with a 64-bit-addressed sparse single-row (or was it single-column?) vector. It is slower than Judy, yet it installs without hassle on Windows, can theoretically address a 128-bit sparse space (because it is 2D), and can store elements shorter than 64-bit integers, i.e. it needs even less RAM in that case.
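On the gap between hash and myhash in the benchmark above: both call PERL_HASH, but B::hash is documented to return a string of the form "0x...", while myhash returns the number directly. That string formatting is plausibly part of the difference, though I have not measured it separately. A minimal sketch of what B::hash hands back and how to turn it into a number ($key is just an illustrative value; the actual hash value can differ between runs and builds, so none is shown):

```perl
use strict;
use warnings;
use B 'hash';

my $key = 'example key';      # illustrative input, not from the benchmark

my $hex = hash($key);         # a string such as "0x...", not a number
my $num = hex $hex;           # convert the hex string to an integer

printf "%s -> %u\n", $hex, $num;
```

If you need the numeric value anyway, the extra hex() call adds to the per-call cost on top of B::hash itself.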