Although I'm certain that this approach works, and will continue to work, MD5 sums are not unique for every file. If they were, MD5 would be the ultimate compression algorithm: you could reconstruct any file from its hash alone, because each hash would have only one possible antecedent. The odds of two different files having the same MD5 sum, however, are very low.
Comment on Re: Re: Using MD5 and the theory behind it
Using one of these approximations, it looks like the probability of a birthday collision will finally hit 0.5 by about the time mr.nick has processed his 22 million million millionth MP3, so I'd agree that he has nothing to worry about for now. ;)
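For anyone who wants to check that figure, here is a rough sketch of the usual birthday approximation, p ≈ 1 - exp(-n(n-1)/2N), with N = 2^128 possible MD5 values. It assumes MD5 output behaves like a uniformly random 128-bit value over ordinary files (not inputs deliberately crafted to collide), and the function names are made up for illustration.

```python
import math

def collision_probability(n: float, bits: int = 128) -> float:
    """Approximate chance that at least two of n random hashes collide."""
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2.0 * space))

def birthday_threshold(bits: int = 128, p: float = 0.5) -> float:
    """Approximate number of hashes needed to reach collision probability p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

n_half = birthday_threshold(128)
print(f"files needed for p = 0.5: {n_half:.3e}")  # on the order of 2.2e19
print(f"p for one million files:  {collision_probability(1e6):.3e}")
```

The threshold comes out near 2.2 x 10^19, which is the "22 million million million" above, while a collection of a million files gives a probability far too small to matter.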