in reply to Re^5: 64-bit digest algorithms
in thread 64-bit digest algorithms
This is a follow-up/in addition to my previous reply. I finally got around to adapting some code I wrote to assess md5 to access CRC32, The result is this image
This image plots the state affect of changing bits in a randomly selected 128-bit input to CRC32 (plotted vertically), against
Although, the number of trials is in effect irrelevant as it doesn't vary (at all) from the state reached after just 1000 trials.
Though as will be explained, the results of this are minimal as the results are very "black & white" (a pun which will also be explained.)
So, for each randomly chosen input, the CRC is calculated. Then, each bit of the input in turn is toggled, the CRC32 recalculated, and the differences between the two CRCs is are counted and accumulated for the number of trials. And the chi-squared is applied to emphasis the differences from the expected averages.
Thus, for each pixel, a mid-grey color (matching the border, represents the expected 50% of the pixels change ideal. A pure black pixel indicates that that pixel never changed. A pure white pixel indicates that that pixel always changed. And, as I said above, the results are very black & white. And they indicate that CRC32 makes for a very poor hashing algorithm, because with some columns are entirely white regardless of the input, and other entirely black, it means that some hash values will be way over used and other never.
The mid-grey and black vertical strips on the right hand third of the image is the result of performing the same processing upon a 16-bit XOR-fold of the CRC32s produced above. The absence of white would seem to indicate that this has improved the result, but the presence of pure black just goes to show that you cannot generate bits using XOR, only dilute them. Upshot: XOR folding is bad for hashes. Truncation is okay because it uses the bits available without diluting them.
And that brings me back to your collisions graphs. I think you got your math wrong. I simply cannot believe that you should expect 1890 collisions from 16384 randomly chosen samples from a domain of 2**32 possibilities. Birthday paradox or not, that is way, way too high a collision rate. Way too high.
Hmm. I uploaded the image and the size displayed on the update page conformed it was seen, but for 20 minutes now, it fails to be rendered back to my browser. So you may not be able to see it.
However, if you download the code below, save it as CRCtest.pl and run it with the command line: perl -s CRC32test.pl -TRIALS=1e5 -MAP=BW, it will be reproduced for you locally, though you will probably need to arrange to load it into an image app yourself (if you use *nix), and then stretch it a little to see it clearly. Updated to pack the CRC. Image updated.
Note:There are various other command line arguments available, including color ramping. If you cannot work out how to use them, ask.
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^7: 64-bit digest algorithms
by gone2015 (Deacon) on Nov 22, 2008 at 03:05 UTC |