Re: Randomizing Unique ID?
by ar0n (Priest) on Aug 09, 2000 at 16:32 UTC
|
Considering the fact of me being lazy, I'd do something like this (just to save typing space):
use Digest::MD5 qw(md5_hex);
use Time::HiRes qw(time);
my $id = md5_hex time();
-- ar0n | Just Another Perl Joe
| [reply] [Watch: Dir/Any] [d/l] |
|
In practice, I prepend a string ($msg = "vcmi$msg") before
I encrypt. Then, when I'm given one of my unique id's back
I can verify that it hasn't been fake'd by decrypting it
and checking for the /vcmi/ at the beginning.
Anyhoo, that's why I use'd blowfish.
| [reply] [Watch: Dir/Any] |
Re: Randomizing Unique ID?
by jettero (Monsignor) on Aug 09, 2000 at 15:36 UTC
|
#!/usr/bin/perl
use Time::HiRes qw /time/;
use Crypt::Blowfish;
$unique = &ecry(time); # the_line
print "Unique ID: $unique\n";
sub ecry() {
my ($e_msg, $m);
my $msg = shift;
my $block;
my $cipher = new Crypt::Blowfish pack("H32", "0123456789abcdef");
$msg = time . "$msg";
while($msg =~ m/(.{1,8})/g) {
$m = $1;
$m = $m . " " while length($m) < 8;
$block = $cipher->encrypt($m);
$e_msg = $e_msg . $block;
}
return unpack("H*", $e_msg);
}
update: tilly suggests using some combination
of time and Math::TrulyRandom on the_line (above). | [reply] [Watch: Dir/Any] [d/l] |
RE: Randomizing Unique ID?
by Adam (Vicar) on Aug 09, 2000 at 21:24 UTC
|
Any algorithm using "random" numbers risks repition. What you want is a unique number. Something like $$ . time (Thats the process ID concatenated with the epoch.) If you want it to use letters and such, change the base after generating the number. Here is an algorithm for that: # For a positive $number
my @nums = (0..9,'a'..'z','A'..'Z');
my $base = @nums;
my $rep = ""; # this will be the end value.
while( $number > 0 )
{
$rep = $nums[$number % $base] . $rep;
$number = int( $number / $base );
}
(Thanks Tye for looking over my shoulder and helping construct that simple algorithm) | [reply] [Watch: Dir/Any] [d/l] |
|
Note that this is barely adequate since a "typical"
double has 53 bits of mantissa (not counting
the sign bit) which can handle just under 16 decimal digits
while time() currently returns 9 digits but will soon return
10, while $$ is typically no more than 5
digits (total will soon be 15 digits).
Doing the conversion to base-62 in two steps would give
you lots of room (and different results).
-
tye
(but my friends call me "Tye")
| [reply] [Watch: Dir/Any] [d/l] [select] |
|
Well, If you want more room in the double, then do the base conversion before doing the concatenation, thusly:
# Generate a truly unique ID
sub GenerateBase
{
my $base = shift;
$base = 62 if $base > 62;
my @nums = (0..9,'a'..'z','A'..'Z')[0..$base-1];
return sub
{
my $number = shift;
my $rep = ""; # this will be the end value.
while( $number > 0 )
{
$rep = $nums[$number % $base] . $rep;
$number = int( $number / $base );
}
return $rep;
}
}
my $ToBase62 = GenerateBase( 62 );
my $ID = $ToBase62->( $$ ) . $ToBase62->( time );
| [reply] [Watch: Dir/Any] [d/l] |
|
However, the risk of repetition is more than outweighed by the
new risk of predictability, especially using time and a process ID (yes,
some smarter OS's have non-sequential PIDs). I do not think the issue
of repetition is much of a risk. In the example the original poster
used, the chance of a repetition is about
1 in 10^31. I'll take those odds any day. :)
| [reply] [Watch: Dir/Any] |
|
The rand() solutions are nearly as predictable,
since rand() is deterministic and the latest Perl's seed
function is deterministic based on $$, time(), and
PL_stack_sp (except on some platforms).
That last value, at the time it is checked, is probably
often constant across invocations of the same script.
The odds are much worse than 1/10^31 as
the entire sequence from rand() is deterministic based on
the seed. Perhaps you meant 1/2^32, which presumes a
completely random seed, which you don't have.
If you want to avoid predictability, then use
Math::TrulyRandom (or encrypt using a secret key, if you
think you can keep it secret) as others suggested.
At least Perl has a much better seed generator these
days.
Also, even 1/2^31 would be the odds of a single duplicate.
If you have lots of IDs with overlapping lifetimes, then
the odds rise surprising fast. If you rarely have more than
a few IDs active at once, then 1/2^31 might well be
acceptable. But if it is possible, for example, to have
6600 simultaneous IDs, you get about a 1% chance
of a collisions per batch of IDs; which probably isn't
acceptable.
-
tye
(but my friends call me "Tye")
| [reply] [Watch: Dir/Any] |
|
|
|
You can have the best of both worlds. Use a combination of randomness and uniqueness to make the output unique. Something like, (and I am building on the code in RE: RE: RE: Randomizing Unique ID? ):
my $ID = $ToBase62->( $$ ) . $ToBase62->( rand( $$ ) ) . $ToBase62->(
+time );
Not forgeting to call srand() at some point. And of course, you can use whatever rand() you want.
Update: Don't bother seeding. Tye explains the internal seed here. Also, see the Base Conversion Utility. | [reply] [Watch: Dir/Any] [d/l] |
Re: Randomizing Unique ID?
by BlaisePascal (Monk) on Aug 09, 2000 at 15:42 UTC
|
I dunno, would something like:
my $id = "";
for (1..18) { $id .= ('A'..'Z','a'..'z','0'..'9')[int(rand(62))];}
work for you? | [reply] [Watch: Dir/Any] [d/l] |
|
Nice solution.
Downsides:
- This recreates the 62 element list repeatedly in memory, since there's
no place to store it.
- The number "62" is hardcoded, and must match the list exactly. That's
a maintenance headache: suppose a requirement came along to ensure that
underscore were included, or that digits were not.
- The int is redundant.
Implementing these three changes, I'd go with something like:
BEGIN {
my @source = ('A'..'Z', 'a'..'z', '0'..'9');
sub generate_random_password {
my $id = "";
for (1..18) {
$id .= $source[rand @source];
}
$id;
}
}
For so-called "pronounceable" passwords, try Crypt::RandPasswd. Looks nice, and based on a government standard.
-- Randal L. Schwartz, Perl hacker | [reply] [Watch: Dir/Any] [d/l] |
|
About your downsides...
1) Yup, I can see capturing the 62-element in a closure instead of re-creating it. I didn't think of that when I was coding off the top of my head.
2) Nice catch
3) Oops... I missed that. I looked in the library to verify the result of rand() and it said it returns a float, not an int. But I guess the [ takes care of that...
Since this is supposed to be a "UniqueID" and not a password, using "pronounceable" passwords are probably not necessary.
Why put it in a BEGIN block? Doesn't seem necessary to me.
| [reply] [Watch: Dir/Any] |
|
Re: Randomizing Unique ID?
by turnstep (Parson) on Aug 09, 2000 at 16:35 UTC
|
my @letters = ('A'..'Z','a'..'z',0..9,qw(! # $ % ^ | _));
my $UID = join("", @letters[map {rand @letters} (1..8)]);
I don't use the characters @ or & because it is sometimes
passed through a cgi and I don't want to make things
more confusing than they have to be. By the way, 8 characters should be plenty, as there are about
69^8 possible combinations. Or without the funky
characters, 62^8, which is 218340105584896 possible
unique IDs. Or get real paranoid and bump it up to
16 characters. :)
| [reply] [Watch: Dir/Any] [d/l] |
Re: Randomizing Unique ID?
by Martin A (Beadle) on Aug 09, 2000 at 19:57 UTC
|
Thanx everybody.
I Think I'll have a go at turnstep's or BlaisePascal's solution. Not that I don't think that the others are good to, but the thing is is that the sucky server that i use doesn't allow me to use any modules. :(
But if there is any other monks with other (non module based) solutions, I am still open for new ideas.
// Martin | [reply] [Watch: Dir/Any] |
|
Note that both of these have a risk of collision that
cannot be reduced below 1/2^(number of actually random bits
in the seed), no matter how many characters you decide to
use in the session ID. This is perhaps a small risk in
this particular case, but I thought the general problem
should be mentioned.
On old versions of Perl, srand()
isn't even called so that each solution will always produce
the exact same ID.
Adam's is a better idea and nearly as simple. It only
requires about 5 characters of ID. You can't have a
collision unless
32000 processes are created within a single
second on your server. If you have multiple computers
doing sophisticated load balancing of web requests such
that IDs need to be unique across machines, then you'll
need to append some characters that are unique to a
machine (what to use depends on how your load balancing
is done).
-
tye
(but my friends call me "Tye")
| [reply] [Watch: Dir/Any] |