in reply to Re^2: How good is gzip data as digest?
in thread How good is gzip data as digest?

isync:

Okay, then you could always amortize the disk lookups by keeping only a fragment of each digest value as the key of an in-memory hash. The short keys keep the hash small, and you'd only need to go to disk when a fragment collides with one you've already stored. It's yet another level of crunching, but it might save you enough RAM and time to get the performance you need.
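To make the idea concrete, here is a minimal sketch in Python (not taken from isync's code). It assumes SHA-1 digests, a 4-byte fragment, and a `dbm.dumb` file standing in for the disk-based hash; the class name `FragmentIndex` and the `disk_reads` counter are made up for illustration.

```python
import dbm.dumb
import hashlib
import os
import tempfile

FRAGMENT_BYTES = 4  # assumed fragment size; tune it against your RAM budget


class FragmentIndex:
    """In-memory set of digest fragments in front of a disk-based store."""

    def __init__(self, path):
        self.fragments = set()                # small: one short fragment per entry
        self.disk = dbm.dumb.open(path, "c")  # full digests live on disk
        self.disk_reads = 0                   # how often we had to consult the disk

    def seen_before(self, data):
        """Return True if data was recorded earlier; otherwise record it."""
        digest = hashlib.sha1(data).digest()
        fragment = digest[:FRAGMENT_BYTES]
        if fragment in self.fragments:
            # Fragment hit: either a real duplicate or a fragment collision.
            # Only in this case do we pay for the disk lookup.
            self.disk_reads += 1
            if digest in self.disk:
                return True
        # Unknown fragment, or a collision with a different digest: record it.
        self.fragments.add(fragment)
        self.disk[digest] = b""
        return False

    def close(self):
        self.disk.close()


# Usage: the third item repeats the first, so only it triggers a disk read
# that finds a match.
tmpdir = tempfile.mkdtemp()
index = FragmentIndex(os.path.join(tmpdir, "digests"))
results = [index.seen_before(item) for item in (b"alpha", b"beta", b"alpha")]
index.close()
# results == [False, False, True]
```

Note that new entries are still written to disk in this sketch; what the fragment set saves is the read for every item whose fragment has never been seen. A longer fragment means fewer false hits but more RAM per entry.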

HOWEVER: Have you actually measured the performance? It would be a pity to waste all this time thinking about it if the disk-based hash would be, in fact, fast enough to serve the purpose.

Remember: First make it work, then make it fast...

roboticus

Re^4: How good is gzip data as digest?
by isync (Hermit) on Apr 02, 2009 at 21:24 UTC
    Actually, I did measure performance, and the difference is significant. But I like your idea of hash fractions. It will go into the next iteration of my lookup-hash algorithm for a test drive.