Not that there aren't a bajillion ways to do this, and the other posts are right, but this looked fun...
mkpasswd by default (on RH9 Linux) gives you 9 alphanumeric characters. But a simple perl -e test, even without storing the results in a hash to eliminate dupes, would still take 40 hours on my (not that old) laptop. So I won't even look at md5sum's output...
I like Abigail-II's suggestion, since a sequence needs no dupe checking. You could devise any number of mathematical algorithms that give you a good sequence without zeroes. Actually, pi might be kind of fun here, but then I think you would have to check for duplicate codes.
By the way, in the following I'm using capital hex digits. But if you used the 10 numbers and 26 letters of the alphabet, you'd have plenty of values to detect algorithmically whether a code is legal without actually checking your database of 1.5 million codes, if you create the codes the way the credit card companies do. You can google for "credit card validation" algorithms if you want to. Well, here's a link.
Okay, back to reality. Another way might be to open /dev/random and read 9 hex digits at a time. Possibly you could even create the file without Perl, as YAML (to quote a recent thread), and deserialize it? That would be neat (read "masochistic").
Well, I tried it with /dev/random on the command line and guess what? My laptop runs out of entropy pretty quickly! (The listing continues when you move the mouse around...) So I tried /dev/audio, though maybe you'd prefer to read a track off an audio CD instead.
The xxd command prints 5 bytes per line as 10 hex digits, and cut takes just the first 9 of them, as you requested. I didn't get many duplicates.
This is what I used..
[mattr@taygeta mattr]$ time perl -e '
    $max = 100000;
    $c   = 0;
    open(OUT, ">codes");
    open(IN, "cat /dev/audio | xxd -u -ps -c 5 | cut -b 1-9 |");
    while (<IN>) {
        $l++;
        if (!exists($h{$_})) {
            $h{$_} = 1;                   # key still carries its newline
            $c++;
            chomp;
            print "$c\n" unless $c % 100; # progress every 100 uniques
        }
        if ($c > $max) {
            print "$l lines $c uniques found\n";
            close(IN);
            foreach (keys %h) { print OUT "$_"; }
            close(OUT);
            exit;
        }
    }'
...
99800
99900
100000
107849 lines 100001 uniques found
real 0m18.977s
user 0m3.720s
sys 0m0.190s
So at this rate it would take under five minutes to get 1.5 million codes. With this or any of the other solutions, you can make some character substitutions afterwards if you want the codes to look nicer.
File "codes" contains stuff like (actual data):
C5FF33FF3
FF2F0082F
FFE2FE5AF
6800EF004
99FF35006
20003101A
F3FF7DFFC
FF14FF46F
Now that looks very repetitive, but my only explanation is that you only need 5 hex digits to count to a million and we're getting 9, so there's plenty of room.
Actually I'm surprised it has so few dupes. I'd be interested if anyone could tell how random this is with one of those hash visualization packages used to test the cryptographic strength of a hash algorithm, if there is one around.
Note: I found a quiet office gave 7% dupes, but banging on the laptop case brought it down to 4%. So maybe you should sing to your Perl code, so that it, you know, runs better.
I have no sig, so why does this div show up?