in reply to Re^3: 64-bit digest algorithms
in thread 64-bit digest algorithms

determining whether there is any coupling between the pairs of bits that would be xor'd together

I read in that same text that in a good digest function, each bit of the input should affect approximately half of the bits of the output. (I think) the implication was that if this ideal is achieved, and the distribution of the affected bits for each bit of input is sufficiently unpredictable, then the digest is 'secure' meaning unpredictable. But the problem with many of the digests that have been shown to be weak, is that the ideal is not achieved, making certain patterns of outputs for given inputs discernable.

I read that to mean that there is coupling between the upper and lower halves of any digest. Good or bad?


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Replies are listed 'Best First'.
Re^5: 64-bit digest algorithms
by GrandFather (Saint) on Nov 13, 2008 at 08:15 UTC

    Coupling between halves is of less concern than if there is coupling between the bits that are xored together. My understanding of the MD5 algorithm suggests that coupling would be low and that xoring the top and bottom halves of an MD5 should retain the distribution characteristics of the non-combined digest (scaled to the smaller range of course). What's the application?


    Perl reduces RSI - it saves typing

      Once I made the leap from 'digest' to 'hash', my searching has been more effective, and I finally turned up (one of) the references I read previously that leads to my concerns regarding the truncation of a larger digest algorithm.

      In particular, see the section headed "Funneling" and the paragraph that reads "For example, consider XORhash and 30-byte keys. All 30 lowest-order key bits funnel into only the lowest-order bit of the internal state. Every set of a billion (2^30) keys, which differ only in the lowest order key bits, maps into just 2 hash values, even though 101 hash values are available."

      Another reference that I haven't yet tracked down, but may be linked indirectly from the above, suggested that testing had shown that truncation (of any form) of MD5 resulted in poor distribution. I don't have the knowledge to either challenge or verify that assertion.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        If I understand your intent correctly you have some message text that you wish to hash to 64 bits. The funnelling weakness is a potential issue for the hash algorithm, but is not a concern for the one off (per message) mixing of, say, a 128 bit MD5 digest down to 64 bits. By using an xor mix of the two 64 bit fields you are retaining the entropy and distribution characteristics present in the individual fields so long as there isn't a problem with coupling between bits (and for MD5 that seems unlikely).

        The sort of code I had in mind would be something like:

        use strict; use warnings; use Digest::MD5; my $msg = 'The quick brown fox jumps over the lazy dog.'; my $digest = Digest::MD5::md5_hex ($msg); $digest = substr ($digest, 0, 8) ^ substr ($digest, 8, 8); print unpack ('H16', $digest), "\n";

        Perl reduces RSI - it saves typing