Although I'm certain that this approach works, and will continue to work, MD5 sums are not unique to each file. If they were, MD5 would be the ultimate compression algorithm: every hash would have exactly one possible antecedent, so you could reconstruct any file from its 128-bit hash alone. The odds of two different files, even similar ones, accidentally having the same MD5 sum are, however, very low.
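
To make the non-uniqueness point concrete, here is a minimal sketch in Python. Finding an accidental collision on a full 128-bit MD5 is impractical, so the sketch shortens the digest to its first 2 bytes; the helper names `truncated_md5` and `find_collision` and the 2-byte truncation are my own assumptions for illustration, not part of the approach above. With only 65,536 possible truncated digests, hashing 65,537 distinct inputs must produce a repeat (the pigeonhole principle). The same argument applies to full MD5, just with vastly more inputs than digests.

```python
import hashlib


def truncated_md5(data: bytes, n_bytes: int = 2) -> bytes:
    """Return the first n_bytes of the MD5 digest (a toy, shortened hash)."""
    return hashlib.md5(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Hash successive inputs until two distinct inputs share a truncated digest."""
    seen = {}  # truncated digest -> first input that produced it
    # 256**n_bytes + 1 distinct inputs guarantee a repeat (pigeonhole principle).
    for i in range(256 ** n_bytes + 1):
        data = str(i).encode()
        digest = truncated_md5(data, n_bytes)
        if digest in seen:
            return seen[digest], data, digest
        seen[digest] = data
    raise AssertionError("unreachable: a collision is guaranteed")


a, b, digest = find_collision()
print(f"{a!r} and {b!r} share the truncated digest {digest.hex()}")
```

In practice a repeat turns up long before all 65,537 inputs are tried. The full digest has 2^128 possible values, which is why accidental collisions between real files are so unlikely even though they must exist.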