in reply to Hashing urls with Adler32
A final word. Don't use Alder32 (alone) for this purpose!
Running Adler, CRC32 and both on several sets of 1 million randomly generated url-like strings ranging from 16 to 128 characters in length, Adler produced duplicates in ~1% of cases; CRC32 produced ~0.2%; and in several runs the combination of both found just 2 duplicates (circa. 0.002% but not enough samples to be judged representative).
|
|---|