Re: 32bit/64bit hash function: Use perls internal hash function?

by haukex (Archbishop)
on Apr 08, 2022 at 09:49 UTC (#11142825)


in reply to 32bit/64bit hash function: Use perls internal hash function?

To answer your question directly, PERL_HASH is exposed by B:

$ perl -MB=hash -le 'print hash("foo")'
0xad6987c4
$ perl -MB=hash -le 'print hash("foo")'
0x5fd458f7

But as I said, you may want to consider using a standardized function, which when implemented in XS should be "nearly as fast". Also, keep in mind that hash tables tolerate collisions, and short checksums will inevitably produce them; since you haven't told us anything about the problem you're trying to solve, we can't know whether that is acceptable in your case.
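
As an illustration of the "standardized function" route, here is a minimal sketch, assuming a truncated MD5 digest is good enough for your purposes (the thread does not say which function would fit, and MD5 is not the fastest choice). Digest::MD5 is a core module; `stable_hash32` is a hypothetical helper name, not something from B or from the original post. Unlike the two B::hash runs above, which printed different values for the same string, this value depends only on the input bytes.

```perl
use strict;
use warnings;
use Digest::MD5 'md5';

# Hypothetical helper (not part of B or any CPAN module): take the first
# four bytes of the binary MD5 digest and read them as a big-endian
# unsigned 32-bit integer.
sub stable_hash32 {
    my ($bytes) = @_;
    return unpack 'N', md5($bytes);
}

printf "%#x\n", stable_hash32("foo");    # prints 0xacbd18db
```

Note that md5() expects a byte string, so text containing wide characters would need to be encoded (for example as UTF-8) first.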

Re^2: 32bit/64bit hash function: Use perls internal hash function?
by sectokia (Pilgrim) on Apr 10, 2022 at 10:59 UTC
    Sadly, hash() from B is useless for high speed stuff, because it appears to use a printf function to make its output string, which cripples its speed. It seems to be meant just for debugging hash values, and not as an interface to get a 32-bit hash.
      it appears to use a printf function to make its output string

      That is true, but its implementation in XS is pretty simple and can be adapted to suit your needs. Given the performance constraints you've described, I think you're going to have to venture in the direction of C/XS anyway.

      Update: In fact, it turns out to be pretty easy!

      use warnings;
      use strict;
      use B 'hash';
      use Inline C => <<'_C_';
      U32 myhash(SV* sv) {
          STRLEN len;
          U32 hash = 0;
          const char *s = SvPVbyte(sv, len);
          PERL_HASH(hash, s, len);
          return hash;
      }
      _C_
      print hash("foo"), "\n";
      printf "%#x\n", myhash("foo");
      __END__
      0x6611676e
      0x6611676e
Re^2: 32bit/64bit hash function: Use perls internal hash function?
by sectokia (Pilgrim) on Apr 10, 2022 at 12:49 UTC
    Thanks! I am a complete noob when it comes to XS, need to really learn it.
      use strict;
      use warnings;
      use feature 'say';
      use B 'hash';
      use Crypt::xxHash 'xxhash3_64bits';
      use Digest::xxH64 'xx64';
      use Benchmark 'cmpthese';
      use Inline C => <<'_C_';
      U32 myhash(SV* sv) {
          STRLEN len;
          U32 hash = 0;
          const char *s = SvPVbyte(sv, len);
          PERL_HASH(hash, s, len);
          return hash;
      }
      _C_

      srand 1234;
      my $s = pack 'C*', map rand 256, 1 .. 64;

      cmpthese -2, {
          hash   => sub { my $x = hash( $s )},
          myhash => sub { my $x = myhash( $s )},
          xxhash => sub { my $x = xxhash3_64bits( $s, 0 )},
          xx64   => sub { my $x = xx64( $s )},
      };
      __END__
                    Rate   hash myhash xxhash   xx64
      hash     1944302/s     --   -52%   -54%   -84%
      myhash   4088577/s   110%     --    -3%   -66%
      xxhash   4233986/s   118%     4%     --   -65%
      xx64    11994386/s   517%   193%   183%     --

      This is perl 5, version 32, subversion 1 (v5.32.1) built for MSWin32-x64-multi-thread

      Try xxHash? Digest::xxH64 is not on CPAN (but it is linked to from the xxHash home page, i.e. officially 'endorsed'(?) :)), Crypt::xxHash needs a fix to install on Windows, and Digest::xxHash (not in the example above) is slower and therefore perhaps not of much interest in the context of 'B::hash is too slow'.

      As already mentioned, Judy::HS provides both hashing and sparse storage built in under the hood. So maybe hashing by hand is not what you need. I have 'played' (i.e. not in serious 'production') with Judy (but not with Judy::HS) to store and access huge sparse data, and, yes, speed is comparable to Perl hashes with a significantly smaller RAM appetite.

      Another option to consider: Math::GSL::SparseMatrix (GSL being solid and renowned, etc.). As above, I 'played' with a 64-bit-addressed sparse single-row (or was it single-column?) vector. It is slower than Judy, yet it installs without hassle on Windows, can theoretically address a 128-bit sparse space (because it is 2D), and can store data shorter than 64-bit integers, i.e. it needs even less RAM in that case.
