I'll read the rest again and try stuff in the morning... In the meantime:
And that brings me back to your collisions graphs. I think you got your math wrong. I simply cannot believe that you should expect 1890 collisions from 16384 randomly chosen samples from a domain of 2**32 possibilities. Birthday paradox or not, that is way, way too high a collision rate. Way too high.
The 1890 collisions is what you get when munging the 32-bit hashes into 16-bit hashes -- so it's 16,384 random thingies in 65,536 bins... So roughly speaking:
after 4,096 tosses ~1/16 of the bins are occupied...
...so the next 4,096 tosses ~4,096 * 1/16 = 256 collisions, and ~ 2/16 of the bins are occupied...
...so the next 4,096 tosses ~4,096 * 2/16 = 512 collisions, and ~ 3/16 of the bins are occupied...
...so the next 4,096 tosses ~4,096 * 3/16 = 768 collisions...
...total 1,536 -- accepting that this is an underestimate, since it ignores the collisions within each batch of 4,096 tosses (the first batch included).
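For anyone who wants to check the arithmetic, here is a minimal sketch in Python (not the code that produced the graphs). It assumes the tosses are uniform and independent, and the function names `expected_collisions` and `stepwise_estimate` are mine, purely for illustration. The first uses the standard expected-occupancy formula, bins * (1 - (1 - 1/bins)^tosses), and counts a collision as any toss landing in an already-occupied bin; the second just replays the batch-by-batch estimate above.

```python
def expected_collisions(tosses: int, bins: int) -> float:
    """Expected number of tosses that land in an already-occupied bin,
    assuming uniform, independent tosses."""
    # Expected number of distinct bins hit after `tosses` tosses.
    expected_occupied = bins * (1.0 - (1.0 - 1.0 / bins) ** tosses)
    # Every toss that did not open a new bin was a collision.
    return tosses - expected_occupied


def stepwise_estimate(tosses: int, bins: int, step: int) -> float:
    """Rough batch estimate: occupancy is held fixed during each batch,
    and every toss in a batch is then counted as filling a new bin."""
    occupied = 0.0
    collisions = 0.0
    for _ in range(tosses // step):
        collisions += step * occupied / bins
        occupied += step
    return collisions


if __name__ == "__main__":
    print(stepwise_estimate(16384, 65536, 4096))  # the rough figure from the post
    print(expected_collisions(16384, 65536))      # the closed-form expectation
```

The batch version reproduces the 1,536 above, and the closed form comes out at roughly 1,887, which is in line with the 1890 on the graph.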
In reply to Re^7: 64-bit digest algorithms
by gone2015
in thread 64-bit digest algorithms
by BrowserUk