in reply to Use time() to create unique ID

Instead of using time to generate your unique IDs, how about a derivative of rand()? Such as:

    use strict;
    use constant id_length => 128;

    foreach (0..3) {
        printf "%s\n", genUniqueID();
    }
    exit(0);

    sub genUniqueID {
        my @alpha = ('A'..'Z', 'a'..'z', '0'..'9');
        my $buffer = "";
        foreach (1 .. id_length) {
            $buffer = $buffer . $alpha[rand($#alpha+1)];
        }
        return $buffer;
    }
    __END__
A sample run produces:
    --$ perl unique_id.pl
    vNJaCW91KaRqtGuVKffRY1ufjOWN8O09h8C2QL28mdNWoRfuLVBawYxWuDLC6L2q2LYPoyyiit6L7jb9OYP3ZbU4Jdf9A1pQMwOBppsEpVEg5HdCijLGlvPSMDe14ANL
    8W6voRR5r1B2zai2aUEYRfC2tXtJKoI1jU0J9gWP7hXdrMV8oQ6qTbQa3B9U6ebc99eOM8TeNacHwUuvFmakIYCWqIfrwjwE01bhxhGcfHKJOcbapt6fRWqhoalTzutb
    GDqdFLUCWe1pichxfUFdQybLLmzsFjdFC2baq7Ec12ftGp6sckbvKrbeGmdt5wj7HYuQ5BnOJQB5eGERsWiMolfHm4f7xYFf6UVfENLhyEn2CNOp55Wh1sajtq6ZOV3T
    I4V3YVHak0pKAwN0V4rLdvAXFRqz1lSCZ9LnDHdZLbPLDQrzd1dJx5iFCXqH4GrEaMgB05DzMUYSTW00y6neDrGOWVphi1xZ2PMxrDilKTJCxBkB5P8oegJCCeI43FpN
    --$

Please note that you can modify the length of the unique key by changing the value in the use constant statement to be whatever you want. However the longer the key the more likely it is to be unique.


Peter L. Berghold -- Unix Professional
Peter at Berghold dot Net
   Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.

Re: Re: Use time() to create unique ID
by davido (Cardinal) on Sep 16, 2003 at 17:27 UTC
    Rand offers the high probability of scarcity in a finite dataset, but it doesn't guarantee uniqueness. And in an infinite dataset (which is, of course, only theoretical) using rand will result in an infinite number of duplications. Even in a small dataset, though highly improbable, there is no guarantee that rand won't return .984553 followed by .984553 within a few iterations. It's within the realm of possibility, even if unlikely.

    If you don't want to use the unique user id module, perhaps you could use a combination of: Process ID, Time, and an in-loop counter. If you want to rely on rand, don't call the ID "unique". Call it "rare". Just because something is improbable doesn't mean it's unique. And why settle for scarcity in a situation where you require uniqueness, when it is truly not that difficult to develop a solution that provides what is actually needed?
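
    A minimal sketch of the Process ID, Time, and counter combination suggested above. The helper name gen_id and the output format are my own assumptions, not something from the thread. Within a single process the counter alone keeps the IDs distinct; across processes this relies on the operating system not reusing a PID within the same second, and across machines you would also need a host identifier.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): builds an ID from the epoch
# time, the process ID, and a per-process counter.
{
    my $counter = 0;    # private to gen_id(), incremented on every call

    sub gen_id {
        return sprintf "%d-%d-%d", time(), $$, $counter++;
    }
}

# Three IDs from the same process differ at least in the counter field.
print gen_id(), "\n" for 1 .. 3;
```

    Nothing here depends on chance: two calls in the same process can never return the same string, because the counter only moves forward.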

    Dave

    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein

          Rand guarantees scarcity in a finite dataset, but it doesn't guarantee uniqueness.

      Let's not get hung up in the difference between the practical and the theoretical here. :-)

      For the purposes stated by the OP using rand() is "good enough." Also based on my own practical use using this method to generate unique session ids for web transactions I have found that it works very well.

      When using this method in my own applications I have very deliberately set up trapping logic checking to make sure that a generated session id is not already in use and if it ever happens the logic logs the incident. The log is still empty for one application I use it for and that web application was installed in August of 2001. Over two years now and no collisions. I think that works pretty darn good.
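
      The trapping logic described above might look roughly like the sketch below. The names gen_random_id, new_session_id, and %in_use are hypothetical; a real application would check its actual session store (a database, a DBM file, and so on) rather than an in-memory hash, and would write to its own log rather than call warn.

```perl
use strict;
use warnings;

my %in_use;    # stand-in for wherever live session IDs are actually stored

# Same idea as genUniqueID() from the original post, with the length
# passed in as a parameter.
sub gen_random_id {
    my ($length) = @_;
    my @alpha = ('A' .. 'Z', 'a' .. 'z', '0' .. '9');
    return join '', map { $alpha[ rand @alpha ] } 1 .. $length;
}

# Hypothetical wrapper: retry until the ID is not already in use and
# log every collision that is seen along the way.
sub new_session_id {
    my ($length) = @_;
    while (1) {
        my $id = gen_random_id($length);
        if ( exists $in_use{$id} ) {
            warn "session id collision: $id\n";
            next;
        }
        $in_use{$id} = 1;
        return $id;
    }
}

print new_session_id(128), "\n";
```

      Note that it is the check against the store, not rand(), that makes the returned ID unique among the IDs currently in use.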

      Truly random and unique ids

      The one time I needed to generate truly random numbers for an application I wrote (I could tell you what it was but then I'd have to shoot you) :) I decided the best way to do it was taking a page from PGP and GnuPG and using system entropy: watching the position of the system disk heads, being influenced by system interrupts (mouse, keyboard, etc.) and stuff like that.
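
      On a Unix-like system, one common way to tap the kernel's entropy pool is to read from /dev/urandom. The sketch below assumes that device exists (it will not on every platform), and entropy_id is a hypothetical helper name. It is not the code from the application mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: reads $bytes bytes from the kernel's entropy-backed
# generator and returns them hex-encoded. Assumes /dev/urandom exists.
sub entropy_id {
    my ($bytes) = @_;
    open my $fh, '<:raw', '/dev/urandom'
        or die "cannot open /dev/urandom: $!";
    my $got = read( $fh, my $buf, $bytes );
    close $fh;
    die "short read from /dev/urandom\n"
        unless defined $got && $got == $bytes;
    return unpack 'H*', $buf;
}

print entropy_id(16), "\n";    # 16 bytes become 32 hex characters
```

      This improves the quality of the randomness; it still does not guarantee that two IDs never match.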

      You can make yourself nuts with the whole subject. Folks a lot smarter than me have made their academic mark on the world writing papers on it, and there is even a whole field of science dedicated to it. For practical purposes you have to make a decision as to what constitutes "random enough" and code accordingly. A random ID of 128 characters is probably going to be random enough for 99% of the uses out there.

      But then... we are getting way off topic here...


      Peter L. Berghold -- Unix Professional
      Peter at Berghold dot Net
         Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
        Also based on my own practical use using this method to generate unique session ids for web transactions I have found that it works very well.

        There is a big difference between "works very well" and hasn't broken yet. Using a random number for a unique ID is akin to adding a known but rarely encountered bug. It is a terrible solution to a problem which has known good solutions.

        -sauoq
        "My two cents aren't worth a dime.";
        
Re: Re: Use time() to create unique ID
by sauoq (Abbot) on Sep 16, 2003 at 23:22 UTC

    You may be fooling yourself with your ID length of 128. On many platforms, the limiting factor is the size of the seed, which is often 32 bits. Running continuously, your code will produce duplicate keys after producing about 2^32/128 of them. (And there is no guarantee that you wouldn't produce a duplicate before that.) On subsequent runs, you have essentially the same problem. Random numbers would be bad enough. Pseudo random numbers create additional problems.
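
    A small sketch of why the seed matters more than the length: once the generator is seeded, every character of the ID is determined, so the same seed yields the same 128-character string. The seed value 42 and the helper name gen_random_id are arbitrary choices for illustration, and the actual seed size depends on your platform and Perl version.

```perl
use strict;
use warnings;

# Same generator as in the original post, with the length as a parameter.
sub gen_random_id {
    my ($length) = @_;
    my @alpha = ('A' .. 'Z', 'a' .. 'z', '0' .. '9');
    return join '', map { $alpha[ rand @alpha ] } 1 .. $length;
}

# The seed value 42 is arbitrary; any fixed seed shows the same effect.
srand(42);
my $first = gen_random_id(128);

srand(42);
my $second = gen_random_id(128);

# Both IDs come from the same generator state, so they are identical.
print $first eq $second ? "same ID\n" : "different IDs\n";
```

    With a 32-bit seed there are at most 2^32 distinct starting states, so a fresh run can only ever begin with one of 2^32 possible IDs, no matter how many characters each ID has.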

    -sauoq
    "My two cents aren't worth a dime.";
    

          You may be fooling yourself with your ID length of 128. On many platforms, the limiting factor is the size of the seed, which is often 32 bits.
      If you are attempting to start an argument with me, you failed. I agree with you. This is where some very serious testing needs to be done on any solution where you are attempting to produce unique keys of any sort. Especially where real "randomness" is required. Hence why elsewhere in this thread I make reference to using system entropy as they do in PGP and GnuPG and other cryptographic products.

      In 25 years of programming I have yet to see a truly random random number generator over a sufficiently large data set without the use of some external influence on the numbers being generated.

      This of course gets back to the basic premise that using time() and friends to produce the ID may not be ideal even for the simplest of applications.

      However, I stand by my opinion that the degree of randomness you need is part of the design criteria you develop in your program specification, and that it depends on how critical the program is and on the data or transactions you are trying to protect. If I am generating unique IDs for sessions dealing with a guest book application (OK... so I'm exaggerating) then I am not going to worry too much about how random the key generation is. On the other hand, if I am protecting national security data where lives are on the line, then I am going to look to somebody like the NSA for guidance as to what the "latest and greatest" crypto algorithm is.


      Peter L. Berghold -- Unix Professional
      Peter at Berghold dot Net
         Dog trainer, dog agility exhibitor, brewer of fine Belgian style ales. Happiness is a warm, tired, contented dog curled up at your side and a good Belgian ale in your chalice.
        If you are attempting to start an argument with me, you failed. I agree with you.

        I wasn't "attempting to start an argument" with anyone. I was pointing out a conceptual misunderstanding that you displayed both with your code and with the assertion you made in the post I was replying to: "However the longer the key the more likely it is to be unique."

        As your code uses a pseudorandom generator, the length of the "key" isn't the limiting factor... the length of the seed is.

        Hence why elsewhere in this thread I make reference to using system entropy as they do in PGP and GnuPG and other cryptographic products.

        And, elsewhere, I point out that relying on randomness for creating unique identifiers is a poor approach altogether.

        There are two points to be made here. 1) Using real random numbers rather than pseudorandom numbers doesn't fix the problem with relying on randomness for generating unique IDs. 2) Using a longer "key" length is not at all guaranteed to reduce the number of duplicates you will get.

        I was addressing #2 in this thread and #1 in the other.

        -sauoq
        "My two cents aren't worth a dime.";