Re^3: MD5-based Unique Session ID Generator
by stvn (Monsignor) on Aug 19, 2004 at 15:29 UTC
Very true, but the double md5_hex() doesn't hurt (as far as I know).
As I said, I am no crypto expert, and my knowledge of these things is limited. But I would think that hashing a reasonably unique string to produce a pretty-darn-close-to-unique string, and then hashing it again to get (what I would assume is) an even closer to truly unique string, is a good thing when generating session ids. Please though, if I am wrong and the double hash provides no benefit, let me know why, as I would be interested in knowing.
I would even go so far as to say that double hashing can actually increase the number of collisions. If you get a collision with the first hash, you are guaranteed to get a collision with the second hash, but if you don't get a collision with the first hash, you still have a chance of getting one with the second hash.
Say you start with X and Y.
hash(X) = X'
hash(Y) = Y'
hash(X') = X''
hash(Y') = Y''
If X' = Y' (a collision at the first hash), then X'' = Y'' (a guaranteed collision at the second hash).
If X' != Y' (no collision at the first hash), then X'' may still equal Y'' (a possible collision at the second hash).
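If you'd rather see the effect than take it on faith, here is a toy experiment I'd run (my own sketch, not anything from the posts above): it truncates MD5 to its first 4 hex digits purely so collisions become frequent enough to count, then compares how many collisions the single and the double hash produce over the same random inputs. The truncation and the input strings are assumptions for the demo only.

use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# 16-bit "toy hash": the first 4 hex digits of MD5, used only so that
# collisions become frequent enough to observe
sub tiny_hash { return substr(md5_hex($_[0]), 0, 4) }

my (%seen_single, %seen_double);
my ($single_coll, $double_coll) = (0, 0);

for my $i (1 .. 50_000) {
    my $input = "input-$i-" . rand();
    my $once  = tiny_hash($input);      # single hash
    my $twice = tiny_hash($once);       # double hash

    $single_coll++ if $seen_single{$once}++;
    $double_coll++ if $seen_double{$twice}++;
}

print "collisions after one hash:   $single_coll\n";
print "collisions after two hashes: $double_coll\n";

Since every first-stage collision is carried through to the second stage, and the second stage can only add new ones, the double-hash count can never come out lower than the single-hash count.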
Please though, if I am wrong and the double hash provides no benefit, let me know why, as I would be interested in knowing.
It doesn't help. Here's why: the md5_hex of a given value will always be the same. So, if md5_hex("hey") is always the same, then md5_hex(md5_hex("hey")), while it is a different digest from the first, will also always be the same. Try it yourself.
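For anyone who wants to try it, a minimal sketch (the "hey" string is just the example value from the paragraph above):

use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

print md5_hex("hey"), "\n";             # same digest every time
print md5_hex("hey"), "\n";
print md5_hex(md5_hex("hey")), "\n";    # a different digest, but just as fixed
print md5_hex(md5_hex("hey")), "\n";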
If the value fed into the first round of md5_hex isn't random, no amount of repetition will make it any more unique. If you were using an encryption algorithm rather than a cryptographic digest, the extra pass might help (depending on the algorithm).
HTH. (BTW: I'm not a crypto expert either, but I have done some amount of research trying to better understand it. If I'm Full O' Shite™, please tell me!)
As we don't want to exercise Cargo Cult, let's see what's done in this snippet:
use strict;
use Digest::MD5 qw/md5_hex/;
for (1..10) {
    my $rand_id    = time() . {} . rand() . $$;
    my $session_00 = md5_hex($rand_id);
    my $session_01 = substr(md5_hex($rand_id), 0, 32);
    my $session_02 = substr(md5_hex(md5_hex($rand_id)), 0, 32);
    printf "%s\n%s %s %s\n\n", $rand_id, $session_00, $session_01, $session_02;
}
"$rand_id" is composed of a couple of items to generate uniqueness:
"time()", "rand()" and "$$" are good for that while "{}", ref to an anonymous hash, doesnt help much, because it's always the same. It is possible to create more than 1 session id within 1 second but it's very unlikely to get more than 1 duplicate random within 1 second. So uniqueness is achieved.
It's a good idea to hash the "readable" id to put it in a regular, non human readable string format. This hashing does not improve the "uniqueness" of the id. It makes it more difficult to be guessed or hacked but that's it!
To hash it a second time doesn't do anything, nor good nor bad(besides performance).
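To put a rough number on the "besides performance" part, a quick benchmark along these lines would do it (the Benchmark comparison and the sample input are my own additions, not part of the snippet above):

use strict;
use warnings;
use Benchmark qw(cmpthese);
use Digest::MD5 qw(md5_hex);

# an input in the same style as the snippet above
my $rand_id = time() . {} . rand() . $$;

# run each variant for at least 2 CPU seconds and compare the rates
cmpthese(-2, {
    single => sub { md5_hex($rand_id) },
    double => sub { md5_hex(md5_hex($rand_id)) },
});

On any reasonable machine both are far too fast to matter for session creation; the second pass simply adds a second MD5 computation on top of the first.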
As we don't want to exercise Cargo Cult....
Guilty as charged, and I thank you for pointing these details out.
"time()", "rand()" and "$$" are good for that while "{}", ref to an anonymous hash, doesnt help much, because it's always the same.
Actually, what you are seeing with the repeating "{}" value will not always be true. It seems (from my experimentation (look ma, no Cargo Cult)) that the repeating value you were seeing was something like the first memory location perl allocates. So on each pass through the loop you were seeing that location reaped and reused, and even when I forked each time within the loop, it did the same thing. However, if you can be sure that this is not the first (?) ref created, you get a bit more randomness in that value. See the code below (spaces added for readability):
my @rand;
for (1..10) {
    # add a random number of elements to the array
    push @rand => $_ for (0 .. ((rand() * 10) % 10));
    my $rand_id = time() . " " . { time => time() } . " " . rand() . " " . $$;
    printf "%s\n", $rand_id;
}
__OUTPUT__
1092953284 HASH(0x1806be4) 0.351068406456278 15758
1092953284 HASH(0x180820c) 0.581041221829715 15758
1092953284 HASH(0x1808230) 0.936157439122312 15758
1092953284 HASH(0x1808284) 0.183180004399297 15758
1092953284 HASH(0x18082c0) 0.943342015904591 15758
1092953284 HASH(0x1808338) 0.424439000654708 15758
1092953284 HASH(0x1808350) 0.935454533284215 15758
1092953284 HASH(0x180838c) 0.771976549032949 15758
1092953284 HASH(0x1808398) 0.549340888274884 15758
1092953284 HASH(0x18083e0) 0.984217993290265 15758
Now of course, as the OP has pointed out to us, not all session generation is alike. This may not work for you if your script starts a fresh perl interpreter each time and the hash-ref always gets the same value. However, if you are in a long-running process, this would seem to contribute to the initial entropy.