Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Re: What is MD5 Hashing and Why is it Important?

by thraxil (Prior)
on Feb 15, 2002 at 16:04 UTC ( [id://145707]=note: print w/replies, xml ) Need Help??


in reply to What is MD5 Hashing and Why is it Important?

MD5 (message digest algorithm) or other hashes are one way functions that produce a "fingerprint". essentially, they map something with a lot of bits down to just a few bits (128 in the case of MD5) in such a way that collisions are as rare as possible.

this is useful because you can compare and store these small hashes much more easily than the entire original sequences. in cryptography, one-way hashes are used to verify something without necessarily giving away the original information. eg, unix stores hashes of passwords instead of the passwords themselves. when a user enters their password, the system computes the hash of it and compares it to the hashes listed in /etc/passwd. since you can't run the hash function in reverse, the system knows that the password you entered is the right one. the crypt that unix uses doesn't really reduce the size but is a similar idea. hashes and digests like MD5 are an integral part of digital signatures.

MD5 is just one of many hashing algorithms. it was developed by Ronald Rivest (the R in RSA) in 1991. it's an updated version of MD4. there's also SHA-1 (Secure Hash Algorithm), developed by NIST in conjunction with the NSA and produces 160 bit digest. it's more collision resistant than MD5 and is the most commonly used hash for cryptographic purposes these days. there are others, but these are the important ones.

anders pearson

Replies are listed 'Best First'.
Re: Re: What is MD5 Hashing and Why is it Important?
by demerphq (Chancellor) on Feb 15, 2002 at 16:42 UTC
    If I recall correctly a well designed hashing algorithm should cause approximately 50% of it bits to change when 1 bit of its input is different.

    But perhaps you can tell me, if SHA1 has less collisions than MD5 why do I see MD5 used more often than SHA1? (In non cryptographic areas anyway) Is MD5 more efficient to generate?

    Yves / DeMerphq
    --
    When to use Prototypes?

      Part of the reason why MD5 is still around is because it's so common. It does have a greater collission risk than SHA1 and this makes it more vulnerable (I'll explain below). However, it is quicker to generate an MD5 digest than SHA1. If you're forced to generate many digests, you'll prefer MD5.

      The reason why these hashing algorithms are so slow is because they were designed to be slow. Consider what happens when a cracker gets your /etc/passwd file (assuming you don't use /etc/shadow). Each entry will have the password hashed and that will resemble the following:

      $1$1PUXLuZE$P.LfclRO9SKqTf2BQK.yD1

      The 1PUXLuZE is the salt. With a crack program, you use the salt with a list of likely passwords to try to recreate LfclRO9SKqTf2BQK.yD1. If you do, you have the password. If there is a collission (more than one password will generate that string), then security is tremendously weakened.

      Now, if most users have a password like F&832*,--?, those probably aren't going to get cracker. However, someone is going to violate your password policy and fail to understand how p4$$w0rd1 was cracked. If the cracker is running crack, though, they could easily run the program for a week before getting to the insecure password. But, if you have collissions, this time could be reduced significantly. SHA1 avoids this vulnerability and also takes longer to compute.

      As of a month and a half ago, I didn't know any of this. I only learned when I asked for feedback on my CGI course and mdillon replied with this node.

      Cheers,
      Ovid

      Update: Read the follow-ups to this post!

      Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.

        erm... i'm pretty sure that none of the hashing algorithms were designed to be slow. if someone came up with an algorithm that provided exactly the same security as SHA-1 but shaved an order of magnitude off the calculation time, that would be used instead. the security is in infeasibility of coming up with two different inputs that map to exactly the same digest. this is ruled by the size of the digest (the probability of a collision is 1 in 2^128 for an ideal 128 bit digest and 1 in 2^160 for an ideal 160 bit digest). those are both pretty much astronomically insignificant. the 160 bit key is of course inherently less likely to produce a collision but 128 is still pretty good. compared to those probabilities, actual execution time of the calculations are insignificant. since the computers doing the hashing legitimately might have to be doing a lot of them, you want them as efficient as you can (this also might to have to be done on things like smartcards and low-power portable devices where CPU speed isn't as abundant.

        MD5's problem is that it's far enough from the ideal 1 in 2^128 that cryptographers (who are orders of magnitude more paranoid than civilians) get a little nervous.

        anders pearson

      basically. it's much faster to compute the 128 bit digest for md5 than the 160 bit digest for SHA-1. since MD5 is good enough for non-cryptographic purposes, the speed advantage makes it a better choice in most cases.

      i don't remember exactly what the weakness was in MD5 but at some point they found that with certain kinds of input MD5 was more likely to produce collisions than it should. the situations that produced this behaviour are pretty rare normally but could be taken advantage of to weaken it in cryptographic applications.

      if you want to know more about the specifics of the weakness, read these:

      M.J.B. Robshaw, On Recent Results for MD2, MD4 and MD5, RSA Laboratories Bulletin 4 (pdf) (November 1996).

      B. den Boer and A. Bosselaers, Collisions for the compression function of MD5, Advances in Cryptology - Eurocrypt '93, Springer-Verlag (1994), 293-304.

      H. Dobbertin, The Status of MD5 After a Recent Attack, CryptoBytes (2) 2 (Summer 1996).

      anders pearson

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://145707]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (4)
As of 2024-03-28 20:23 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found