in reply to Question: Generate unique/random 12-digit keys for 25,000K records, howto??
12-digit unique/random keys against over 25,000K records.
Using rand and digests is wrong for this.
Sounds easy, but as the list of numbers you've already used grows, the process of finding one you haven't takes longer and longer.
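To see that slowdown for yourself, here is a minimal sketch of the "pick a random number, retry if it is already used" approach. The key space of 1000 is an assumption chosen purely to exaggerate the effect; with a bigger space the same thing happens, just later.

```perl
use strict;
use warnings;

my $space = 1000;   # deliberately tiny key space (assumption for the demo)
my %used;           # values already handed out
my @attempts;       # attempts needed to find each new unused value

for my $i ( 1 .. $space ) {
    my $tries = 0;
    my $candidate;
    # Keep drawing until we hit a value that has not been used yet.
    do {
        $candidate = int rand $space;
        ++$tries;
    } while $used{ $candidate };
    $used{ $candidate } = 1;
    push @attempts, $tries;
}

my $first = 0; $first += $_ for @attempts[ 0 .. 99 ];
my $last  = 0; $last  += $_ for @attempts[ -100 .. -1 ];
printf "first 100 values: %d attempts; last 100 values: %d attempts\n",
    $first, $last;
```

The exact counts vary from run to run, but the last 100 values typically cost far more attempts than the first 100.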
Whatever digest you use, duplicates aren't impossible, just very rare. But whatever the odds of your finding a duplicate are, they (at least) double with every bit you throw away.
I.e. if you start with a 128-bit (say MD5) digest, convert it to its 32-character hex representation, and then truncate that to 12 characters, you keep 12/32 * 128 == 48 bits.
And 128 - 48 = 80, so your new 12-character identifiers are (at least) 2**80, or 1,208,925,819,614,629,174,706,176, times more likely to encounter duplicates.
However low the odds are to begin with, that's a huge increase in the risk.
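The arithmetic above can be checked with a few lines of Perl. This is just a sketch of the calculation; the variable names are mine, and Math::BigInt is used only because 2**80 is too large to print exactly as a native number.

```perl
use strict;
use warnings;
use Math::BigInt;

my $digest_bits = 128;   # size of an MD5 digest
my $hex_chars   = 32;    # length of its hex representation
my $kept_chars  = 12;    # characters left after truncation

# Each hex character carries 128/32 = 4 bits.
my $kept_bits = $kept_chars * ( $digest_bits / $hex_chars );
my $lost_bits = $digest_bits - $kept_bits;

# Every discarded bit at least doubles the odds of a duplicate.
my $factor = Math::BigInt->new( 2 )->bpow( $lost_bits );

print "kept bits: $kept_bits\n";   # 48
print "lost bits: $lost_bits\n";   # 80
print "factor:    $factor\n";      # 1208925819614629174706176
```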
By far the simplest way for any reasonably low number of values (and 25k is very low) is to generate the N unique numbers, shuffle them, and take the next one off the top of the list each time you need to assign one.
Guaranteed unique. Completely random association between the number and what it is assigned to. And trivial to code.
    use List::Util qw[ shuffle ];
    my @randUnique = shuffle map{ sprintf "1%011d", $_ } 1 .. 25e3;;
    print $randUnique[ $_ ] for 1 .. 10;; ## First 10
    100000016682
    100000002653
    100000013669
    100000004625
    100000009482
    100000002763
    100000022284
    100000000048
    100000015278
    100000012155
The argument can be made that, whilst the association between any given number and what it represents is unguessable, guessing a valid number, regardless of what it represents, is trivial. If that is the major concern, then a small adjustment is needed.
    my @randUnique = shuffle map{ sprintf "%07d%05d", 1e7+int( rand 1e6 ), $_ } 1 .. 25e3;;
    print $randUnique[ $_ ] for 1 .. 10;;
    1091497817847
    1018472205890
    1028707802676
    1078720009752
    1016540524132
    1074148507607
    1022293016846
    1018341020021
    1038845808717
    1056634512933
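One caveat: 1e7+int( rand 1e6 ) is always an eight-digit value, so the keys printed above are 13 digits wide rather than 12. If exactly 12 digits are required, one possible variant (my assumption, not the code that produced the output above) is a seven-digit random prefix followed by the five-digit sequence number. The sketch below also checks the result for width and uniqueness.

```perl
use strict;
use warnings;
use List::Util qw[ shuffle ];

my $n = 25e3;

# Seven-digit random prefix (1000000 .. 9999999) plus a five-digit
# sequence number. The sequence number alone guarantees uniqueness;
# the prefix only makes valid keys hard to guess.
my @randUnique = shuffle map {
    sprintf( "%07d%05d", 1e6 + int( rand 9e6 ), $_ )
} 1 .. $n;

# Sanity checks: every key distinct, every key exactly 12 digits.
my %seen;
$seen{ $_ }++ for @randUnique;
my $badWidth = grep { !/^\d{12}$/ } @randUnique;

printf "keys: %d, distinct: %d, wrong width: %d\n",
    scalar @randUnique, scalar keys %seen, $badWidth;
```

Note that the five-digit suffix caps this layout at 99,999 keys; beyond that the suffix would need widening at the expense of the prefix.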
Now, guessing a valid number has a very low probability (25e3/1e12 ~= 0.000000025), and even if the bad guy does succeed, there's no way for him to know what his guess represents.
Just store your 25k unique random numbers somewhere, and take the next one from the top of the pile as you need it. Or store a million; it takes no time to generate them:
    print time(); my @randUnique = shuffle map{ sprintf "%07d%05d", 1e7+int( rand 1e6 ), $_ } 1 .. 1e6; print time();;
    1209603392
    1209603398 ## 6 seconds for 1e6
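The "store them somewhere and take the next one" step could look something like this. It is only a sketch: the file name key_pool.txt and the subs write_pool and next_key are made up for illustration, and there is no file locking, so it assumes a single process hands out the keys.

```perl
use strict;
use warnings;
use List::Util qw[ shuffle ];

# Hypothetical file name; any persistent store would do.
my $pool_file = 'key_pool.txt';

# Write the pool to disk, one key per line.
sub write_pool {
    my( $file, @keys ) = @_;
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$_\n" for @keys;
    close $out or die "Cannot close $file: $!";
    return;
}

# Remove and return the key at the top of the pile.
sub next_key {
    my( $file ) = @_;
    open my $in, '<', $file or die "Cannot read $file: $!";
    chomp( my @keys = <$in> );
    close $in;
    die "Key pool exhausted\n" unless @keys;
    my $key = shift @keys;
    write_pool( $file, @keys );   # persist the remainder
    return $key;
}

# Generate and store the shuffled pool once ...
write_pool( $pool_file, shuffle map { sprintf( "1%011d", $_ ) } 1 .. 25e3 );

# ... then draw keys as records arrive.
for ( 1 .. 3 ) {
    my $key = next_key( $pool_file );
    print "$key\n";
}
```

Rewriting the whole file on every draw is fine for 25k keys; for much larger pools, a database table or a stored offset into the file would be the more sensible choice.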