There are 36^10 = 3656158440062976 different 10-character words if you use the characters [0-9a-z], and storing them would require quite a lot of disk space.
True, but that is why I want to limit the number of characters to between 8 and 12. That would reduce the number considerably, and still make a pretty good dictionary.
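#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
use Convert::AnyBase;
# Counter to encode; named $n because $a is reserved for sort blocks.
my $n = 0;
# Custom digit set: each integer is written in base 9 using these characters.
my $base = Convert::AnyBase->new(set => '02468acez');
# Deliberately endless loop: the string condition is always true.
say $base->encode($n++) while "there's some disk space";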
I don't think you quite understood zwon's comment... There are 36^10 words composed of exactly 10 characters from [0-9a-z]. "Limiting" the number of characters to between 8 and 12 increases the number of combinations to 36^8 + 36^9 + 36^10 + 36^11 + 36^12, which adds up to 4873763581670522880 words.
Assuming an average of 10 characters per word (which is low, as there are many more 12-character combinations than 8-character combinations, but I'm not going to bother calculating the actual average length), plus a one-byte separator to divide them, that makes 4873763581670522880 * 11 = 53611399398375751680 bytes, or about 46.5 exabytes (counting an exabyte as 2^60 bytes), or 48.8 million terabytes, if you prefer that unit.
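For anyone who wants to check the arithmetic, here is a minimal sketch that redoes the count with the actual word lengths instead of the 10-character average. The `keyspace` function and its parameter names are my own invention for illustration, and I'm assuming a one-byte separator per word. Math::BigInt is used because 36^12 is too large to trust to floating point.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical helper: returns the number of words of length $min_len..$max_len
# over an alphabet of $alphabet_size characters, and the bytes needed to store
# them with one separator byte per word.
sub keyspace {
    my ($alphabet_size, $min_len, $max_len) = @_;
    my $words = Math::BigInt->bzero;
    my $bytes = Math::BigInt->bzero;
    for my $len ($min_len .. $max_len) {
        # Number of distinct words of exactly $len characters.
        my $count = Math::BigInt->new($alphabet_size)->bpow($len);
        $words->badd($count);
        # Each word costs $len bytes plus one separator byte.
        $bytes->badd($count->copy->bmul($len + 1));
    }
    return ($words, $bytes);
}

my ($words, $bytes) = keyspace(36, 8, 12);
print "words: $words\n";
print "bytes: $bytes\n";
```

Since nearly all of the words are 12 characters long, the byte total this prints comes out larger than the 11-bytes-per-word estimate.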
Personally, I don't know anyone who has a few exabytes of spare disk sitting around to store all those words.
There's also the minor detail that, if you're generating a billion words per second, it would take about 154 years to create all the 8-12 character combinations, even without writing them to disk.
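The running time is easy to reproduce as well. This is only a rough sketch: the `years_to_generate` name is made up for illustration, the rate of a billion words per second is the figure assumed above, and I'm taking a year to be 365.25 days.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: years needed to generate $words words at
# $words_per_second, taking a year as 365.25 days.
sub years_to_generate {
    my ($words, $words_per_second) = @_;
    my $seconds_per_year = 365.25 * 24 * 60 * 60;
    return $words / $words_per_second / $seconds_per_year;
}

# Total number of 8-12 character words over [0-9a-z], as computed above.
my $total_words = 4873763581670522880;
printf "%.1f years\n", years_to_generate($total_words, 1e9);
```

Floating-point precision is plenty here, since only the first few digits of the answer matter.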