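A 12-digit hex string can have 281_474_976_710_656 (16**12, or 2**48) different values, so it's certainly capable of representing 50_000_000 different strings. That said, Digest::SHA1 does not guarantee that two different strings will result in different hashes. Quite the opposite: collisions are always possible, and keeping only 12 hex digits of the digest makes them much more likely. If all 50M inputs map to different outputs, there's a certain degree of luck involved. (I haven't worked out the exact probability, but a birthday-bound estimate puts the expected number of colliding pairs among 50M keys in a 2**48 space at about 4, so at least one collision is more likely than not.)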
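For a rough feel for how much luck is involved, here is a minimal sketch of the standard birthday-bound approximation. It assumes the 12 hex digits behave like uniformly random 48-bit values, which is only an approximation for a truncated SHA-1 digest; the function name `collision_probability` is made up for this example.

```perl
use strict;
use warnings;

# Birthday-bound approximation: with $n keys drawn uniformly from
# $space possible values, the expected number of colliding pairs is
# about n*(n-1)/(2*space), and P(at least one collision) is about
# 1 - exp(-that).
sub collision_probability {
    my ($n, $space) = @_;
    my $expected_pairs = $n * ($n - 1) / (2 * $space);
    return 1 - exp(-$expected_pairs);
}

my $space = 16 ** 12;    # number of distinct 12-digit hex strings

for my $n (1_000_000, 10_000_000, 50_000_000) {
    printf "%10d keys: P(at least one collision) ~ %.4f\n",
        $n, collision_probability($n, $space);
}
```

The estimate grows roughly with the square of the number of keys, which is why a key space that looks enormous next to 50M can still be too small for hashing alone.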

If you want the output to be exactly as unique as the input, you'll have to come up with some kind of reversible encoding, that is, one where it is possible to go from the encoded version back to the original string.

The problem is that your inputs have a significantly larger alphabet than the 16 hex digits, and they're longer than the 12 digits you want to store. (I'm guessing a little here because you haven't told us anything about the format of the inputs.) As such, I think it's safe to say there's no way to encode every possible input string into the output format you want. There just isn't enough "space" to do it in.
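The "space" argument can be made concrete by counting bits. The sketch below compares the capacity of 12 hex digits with that of a hypothetical input format; the 36-character alphabet and length of 20 are pure assumptions for illustration, since the real input format is unknown, and `bits_needed` is a made-up helper name.

```perl
use strict;
use warnings;

# Number of bits needed to distinguish every possible string of
# $length characters drawn from an alphabet of $alphabet_size symbols.
sub bits_needed {
    my ($alphabet_size, $length) = @_;
    return $length * log($alphabet_size) / log(2);
}

my $output_bits = bits_needed(16, 12);    # 12 hex digits
my $input_bits  = bits_needed(36, 20);    # ASSUMED: 20 chars of [a-z0-9]

printf "output capacity: %.1f bits\n", $output_bits;
printf "input capacity:  %.1f bits\n", $input_bits;

if ($input_bits > $output_bits) {
    print "No reversible encoding of every possible input can fit.\n";
}
```

Whenever the input side needs more bits than the output side offers, at least two possible inputs must share an output, no matter how clever the encoding is.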

That leaves two options, in my opinion.

  1. Use a hashing algorithm and rely on luck.
  2. Enumerate the inputs with a serial number until you run out of them (i.e., let the database do it).
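Both options can be sketched in a few lines. This is only an illustration under some assumptions: it uses `sha1_hex` from the core Digest::SHA module rather than Digest::SHA1, an in-memory counter stands in for a database sequence or auto-increment column, the sample inputs are invented, and `hashed_key` and `serial_key` are hypothetical helper names. It also assumes a perl with 64-bit integers.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Option 1: keep the first 12 hex digits of the SHA-1 digest.
# Different inputs can still produce the same key.
sub hashed_key {
    my ($string) = @_;
    return substr(sha1_hex($string), 0, 12);
}

# Option 2: format a serial number as 12 hex digits.
# Unique by construction until the counter reaches 16**12.
sub serial_key {
    my ($n) = @_;
    die "serial number out of range\n" if $n < 0 || $n >= 16 ** 12;
    return sprintf('%012x', $n);
}

my @inputs = ('first record', 'second record', 'third record');

# Option 1 in use: remember which input produced each key so that a
# collision is noticed instead of silently overwriting a record.
my %input_for_key;
for my $input (@inputs) {
    my $key = hashed_key($input);
    if (exists $input_for_key{$key} && $input_for_key{$key} ne $input) {
        warn "collision on $key: '$input' vs '$input_for_key{$key}'\n";
    }
    $input_for_key{$key} = $input;
    print "hashed  $key  $input\n";
}

# Option 2 in use: hand out the next serial number to each new input.
my $next_serial = 0;
my %key_for_input;
for my $input (@inputs) {
    $key_for_input{$input} //= serial_key($next_serial++);
    print "serial  $key_for_input{$input}  $input\n";
}
```

The serial version needs a lookup table (or the database) to get from an input back to its key, but it never depends on luck.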

In reply to Re^3: Question: Generate unique/random 12-digit keys for 25,000K records, howto?? by kyle
in thread Question: Generate unique/random 12-digit keys for 25,000K records, howto?? by lihao
