Re^12: On showing the weakness in the MD5 digest function and getting bitten by scalar context

Funnily enough, I don't have a beef. Nor have I advocated anyone should continue to use md5--nor that they should stop.

My original assessment of the significance of the "discovery" was that it was overstated. Since then, I have attempted to assess its impact (for myself) by applying the information available about the discovery to the various applications to which md5 is put.

I've tried to counter the FUD that anything that uses md5 is an immediate security risk, leaving its users immediately vulnerable. And to counter the inevitable reactions from many users that they must immediately and irrevocably seek some alternative technology to secure their systems, websites and other applications.

I've tried to point out that it was always known that multiple messages are possible for any given md5. And that for most applications, incorporating the knowledge of that possibility into the design of the application using md5 can entirely eliminate any consequences of that possibility becoming a reality.
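As an illustration of what "incorporating the knowledge of that possibility into the design" might look like, here is a minimal sketch in Python. The class name `DigestIndexedStore`, its methods and the `digest_fn` parameter are all hypothetical, invented for this example; the only idea taken from the text is that the md5 is treated as an index and a match is confirmed byte for byte.

```python
import hashlib


def md5_hex(blob: bytes) -> str:
    """Hex md5 of a byte string (the default index function)."""
    return hashlib.md5(blob).hexdigest()


class DigestIndexedStore:
    """Hypothetical store that files blobs under their md5 but never
    treats equal digests as proof of equal content."""

    def __init__(self, digest_fn=md5_hex):
        self._digest_fn = digest_fn
        self._buckets = {}  # digest -> list of distinct blobs sharing it

    def add(self, blob: bytes):
        """Store blob if new; return the (digest, slot) pair that names it."""
        digest = self._digest_fn(blob)
        bucket = self._buckets.setdefault(digest, [])
        for slot, existing in enumerate(bucket):
            if existing == blob:  # byte-for-byte comparison, not digest equality
                return digest, slot
        bucket.append(blob)  # same digest, different bytes: keep both
        return digest, len(bucket) - 1

    def get(self, digest: str, slot: int) -> bytes:
        """Fetch a blob by the pair returned from add()."""
        return self._buckets[digest][slot]
```

With a design along these lines a duplicate md5 costs an extra comparison and an extra slot rather than silently replacing one message with another. Whether that is enough depends on the application; it does nothing for protocols where only the digest, and not the content, is available to compare.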

I've tried to demonstrate that even for those applications that chose to ignore the possibility of "duplicates" in their design, the nature of the mechanism by which (so far) a duplicate has been generated means that the content of the mathematically generated message is just an arbitrary collection of bytes.

The attacker using the message does not control that content. As such, any attempt to use such a generated message to attack an md5 based system is:

unlikely to go unnoticed.
unlikely to benefit the attacker in any meaningful way.

And even in those applications (so far, one example described) in which the possibility of duplicates could disrupt the application, simple steps can be taken to re-secure it with only minor changes to its design, and without requiring any "new technology" to do so.

Basically, I've attempted to apply a little calm logic to the situation and understand the actual implications of this discovery for myself, and those purposes to which I put md5. On the way, I've also tried to consider the exposure that the discovery leads to for many common applications of md5 as they have come up.

That's it. Drawing my own conclusions, for my own existing uses of md5, and attempting to contribute to a discussion about the likely impact of the discovery for other, more important, existing uses of md5.

Consider double-DES: the known mathematical attack against a single DES system must be applied to 2 DES keys simultaneously and repeated until a compromise that matches both is found. And the multiplication factor of the work that must be done by the attacker means that even if that mathematical attack can be successfully applied to two keys simultaneously (which is still extremely unlikely, though not proven impossible), it will take a very, very long time. Using triple-DES makes the attack not just extremely unlikely to succeed, but so laborious as to be a total waste of time trying.

Equally, using two nested md5's means that for an attack to be effective, the attacker must not only apply the mathematics to both keys to find duplicate messages; they must do it such that the math produces a single message that (in whole or part) matches both keys. In my less than authoritative opinion, that is as close to impossible as I will ever need to consider.
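One existing construction that fits the description of "two nested md5's" with two keys is HMAC, which runs an inner and an outer md5 pass, each prefixed with a different pad derived from a shared secret. Reading the paragraph above that way is an assumption on my part, not something the post states. A minimal Python sketch (the function name `nested_md5` is hypothetical):

```python
import hashlib

MD5_BLOCK_SIZE = 64  # md5 processes its input in 64-byte blocks


def nested_md5(key: bytes, message: bytes) -> str:
    """Two nested md5 passes in the HMAC arrangement, written out by hand."""
    if len(key) > MD5_BLOCK_SIZE:
        key = hashlib.md5(key).digest()  # over-long keys are hashed first
    key = key.ljust(MD5_BLOCK_SIZE, b"\x00")  # then zero-padded to one block

    inner_key = bytes(b ^ 0x36 for b in key)  # pad for the inner pass
    outer_key = bytes(b ^ 0x5C for b in key)  # pad for the outer pass

    inner = hashlib.md5(inner_key + message).digest()
    return hashlib.md5(outer_key + inner).hexdigest()
```

This should give the same result as `hmac.new(key, message, hashlib.md5).hexdigest()` from the standard library, which is the version to use in practice. Note that the keys matter here: with no key at all, two messages that share an md5 would also share md5(md5(message)), because the outer pass would be fed identical input.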

It is my conclusion (drawn for my own purposes) that if the algorithms using md5 accepted the reality that somewhere, sometime, a duplicate will come up, and incorporated that reality into their design, then the effect of that reality can be completely negated. And any application, especially a security application, that ignores the conjectural nature of the uniqueness factor of md5s, and relies upon that uniqueness, is broken--not by this discovery, but by design.

Others must reach their own conclusions, based on their knowledge of their uses of md5, and the urgency of maintaining their security.

Examine what is said, not who speaks.

"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon


Replies are listed 'Best First'.
Re^12.5: On showing the weakness in the MD5 digest function and getting bitten by scalar context by Anonymous Monk on Aug 30, 2004 at 17:00 UTC
"And any application, especially a security application that ignores the conjectural nature of the uniqueness factor of md5s, and relies upon that uniqueness, is broken--not by this discovery, but by design."

Everything is conjecture, I'm afraid. If you're holding out for absolute proof of security, you will be waiting a long time. What assurances we have come from mathematical proofs that assume the existence of a collision-free hash function(1) and reason from there. Since someone has found a way to generate collisions, that makes the proofs useless for MD5.

You are correct in pointing out that the collisions that are generated take a particular form, and that form may or may not expose an actual vulnerability in real cryptographic protocols. We could try to prove that no vulnerability exists, but the proofs would become fiendishly difficult, and not everyone would have confidence in the ability of mathematicians to get them right. The prudent course of action is to switch to a hash function for which the original conjecture still holds.

(1) Collision-free is a technical term meaning collisions are hard to find, not that they don't exist. Some proofs don't need the collision-free property, so I guess they're still safe. Void where taxed, licensed, or restricted. Professional driver -- do not attempt.

"Others must reach their own conclusions, based on their knowledge of their uses of md5..."

Another interesting thing about cryptography is that everyone thinks they're an expert.
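The distinction in the footnote, that collisions always exist but are meant to be hard to find, can be seen on a toy scale. The sketch below (Python; the function name and the truncation are assumptions made purely for illustration, and this is a generic birthday search, not the attack under discussion) looks for two inputs whose md5 digests agree in their first few bytes.

```python
import hashlib
from itertools import count


def find_truncated_collision(prefix_bytes: int):
    """Return (first, second, tries): two distinct inputs whose md5
    digests share their first prefix_bytes bytes, and the number of
    inputs hashed before the match turned up."""
    seen = {}  # truncated digest -> input that produced it
    for n in count():
        message = str(n).encode()  # distinct n gives distinct messages
        tag = hashlib.md5(message).digest()[:prefix_bytes]
        if tag in seen:
            return seen[tag], message, n + 1
        seen[tag] = message
```

With `prefix_bytes=2` there are only 65536 possible tags, so a match is guaranteed within 65537 inputs and typically turns up far sooner. Each additional byte multiplies the expected work by roughly 16; for the full 16-byte digest the same generic search would need on the order of 2^64 hashes, which is what "hard to find" was supposed to mean.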
Re: Re^12.5: On showing the weakness in the MD5 digest function and getting bitten by scalar context by BrowserUk (Patriarch) on Aug 30, 2004 at 19:41 UTC
I can't (nor would I try to) dispute the math, but I do wonder about your conclusion:

"The prudent course of action is to switch to a hash function for which the original conjecture still holds."

Given that it is sheer scale that is the basis of these hashing algorithms' utility, it is almost a non sequitur to consider proving them. The very thing that prevents them from being trivially cracked through brute force is the same thing that prevents them from being rigorously proved by that same method.

Mathematicians can construct proofs (that are way (way, way) over my head) for seemingly much more complex algorithms than these. Many such proofs have later been shown to be false, in the light of further analysis, years or even decades later. Anyone who's read ISBN 1-85702-699-1 knows this to be so.

Any new algorithm is just as likely to be weak in the same respect as md5. Except it could be that those who discover the weakness of a new algorithm are not so publicly spirited as to announce their discovery to the entire world.

To me (a self-described non-expert), it seems it would make more sense to base one's security upon protocols that acknowledge that hashing algorithms do produce duplicates and factor that into the overall protocol. It also makes sense to use the combinatorial effect of multiple passes of the same (or different) weak hashes to produce a much harder target for the mathematical attack to aim for. Admittedly, these can be even harder to prove, but in that lies a little reassurance that they are also harder to attack.

It also seems that it would be better to analyse the method of attack, and use its properties to devise protocols that specifically counter that attack, than to surrender that hard-won knowledge in favour of another, equally unverified and unverifiable algorithm.

"Another interesting thing about cryptography is that everyone thinks they're an expert."

S'funny, but had you said that about security, I would have been in complete agreement. The world, or at least the internet, seems to be full of, usually self-proclaimed, "security experts". In my experience, there is nothing more dubious than those that need to "claim" expertise.

If you're a student of the history of cryptography, you'll know that most of the best cryptographers have been hobbyists and enthusiasts, though many were mathematicians first and foremost.

As for myself, it is yet another of those subjects that I am fascinated by, but claim absolutely zero expertise in. Not in this thread, nor any other, will you see me claim expertise. I have some experience in a range of different computer related fields. And I've worked with, and know, some genuine experts in several. Nothing more.

I am reminded of (one of) Pournelle's Laws: If you don't know what you're doing, make sure you know someone who does.