hangon has asked for the wisdom of the Perl Monks concerning the following question:

I need to randomize large integers (> 32K). I looked at some modules, but I just need to call $foo = rand $bar repeatedly without setting up objects & parameters. So, I came up with the following scheme to extend the range of rand to around 1G. Basically, it breaks $bar into blocks of nearly equal size (within +1 -0) and calls rand twice, once to select a block, and again to select the number within the block, then it totals everything up. It appears to DWIM and seems a reasonable approximation of rand (the distribution of numbers generated compares closely to rand), but I don't have the expertize to test it rigorously.

It's to be used for simulation, and doesn't need to be perfect. Does this seem like a reasonable approach? Am I overlooking any potential problems?

sub bigrand{ my $in = shift; my $limit = 32768; if ($in <= $limit){ return int rand $in; } my $blockSize = int($in ** .5); my $blockSelect = int rand int ($in/$blockSize); my $leftOver = $in % $blockSize; if ($blockSelect < $leftOver){ return ($blockSelect * ($blockSize + 1)) + int rand ($blockSize + 1); }else{ return ($leftOver * ($blockSize + 1)) + (($blockSelect - $leftOver) * $blockSize) + int rand ($blockSize); } }

Update: It's been an education. Up to now, rand is one of those functions I've just used and taken for granted. Thanks especially to BrowserUK for the explanation and links to other nodes, and to tachyon-II whose module Math::Random::MT::Perl I ended up using.

Replies are listed 'Best First'.
Re: Randomizing Large Integers
by BrowserUk (Patriarch) on May 19, 2008 at 18:00 UTC

      A number of programs have already been written for this simulation, so I actually need to override rand globally using one of its common modules. Math::Random::MT would probably be perfect except for the umm... "political" issues regarding compilers on this system. Thanks for the suggestion though.

        I'm assuming you're on a Win32 system (you mention >32k integers. Win32 CRT rand is limited to 15-bits of output. Most other systems (*nix at least) have 48-bit rands in their CRTs). If that assumption is correct, you might be able to bypass the compiler issue by doing a binary install from PPD.

        You don't indicate how rigourous your random requirements are, but the problem with deriving more bits of randomness from less is that the process invariably acts to compound the flaws in the underlying PRNG. If you plot the values from the Win32 Perl rand function on a 2D grid (jpg or png) and increment the color of the pixel each time you generate the same value, you very quickly see distinct banding appear where the distribution of the output is very non-uniform. If you combine random values from a flawed PRNG in an attempt to increase teh number of bits, it almost invariably compounds the banding effect and leads to a highly skewed output distribution.

        Essentially, you cannot manufacture more bits of randomness, only re-use the existing ones, and that leads to a Doppler effect. The rarified (below random chance) parts of the distribution tend to get further rarified, whilst the above average parts of the distribution tend to combine and reinforce each other leading to very uneven distributions.

        Maybe this doesn't affect your application, but if your algorithm depends upon an even distribution for its operation (eg. Monte Carlo Simulations), then you'd be well advise to seek a good PRNG up front rather than discovering that you are skewing your results from the outset.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        Here is a pure Perl version of Math::Random::MT::Perl. About 1/3-1/2 as fast as the C/XS but still good for 100,000 well distributed 32 bit random numbers per second. Output identical to the C version. Coming to Latest version at Math::Random::MT::Perl

Re: Randomizing Large Integers
by mscharrer (Hermit) on May 19, 2008 at 18:25 UTC
    Is 32K a limitation of the standard rand function? I figured that in my installation (v5.8.8 built for i486-linux-gnu-thread-multi) rand works for given numbers up to 2**32. But I couldn't find any documentation for the limit. A 32 bit limitation seems logical, however. In my case I needed random hex numbers with 2**N limits, so I just concatenated multiple rand calls using printf. On Linux you also can read from /dev/urandom to get random bytes.
      Is 32K a limitation of the standard rand function?

      The macro (uconfig.h, line 3368 in 5.8.8 or 4057 in 5.10.0) looks like only 15 bits are being used:

      #define Drand01() ((rand() & 0x7FFF) / (double) ((unsigned long)1 << +15)) ^^^^

      Update: though, a somewhat closer look suggests this might be configuration dependent (at least, Configure checks for drand48()), so maybe I spoke too soon...

      Update2: just ran a test build on a system which has drand48()... et voila, the macro ends up being defined as

      #define Drand01() drand48()

      So, as often, the answer is: it depends :)

        15 bit of the input or of the output?

        I get the following behavior that lets me assume that 32 bit random integers work:

        # perl -e 'printf "%010x\n", int rand 2**16' 00000052e2 # perl -e 'printf "%010x\n", int rand 2**32' 006f7ae0c5 # perl -e 'printf "%010x\n", int rand 2**64' 00ffffffff

        i.e. rand 2**N seems to return a N bit random integer, up to N = 32 but not above.

      From what I've read, rand is dependent on whats installed on the OS. On this system, when feeding rand with integers > 32768, the distribution of values returned (from multiple calls using the same number) breaks down badly. So this seems to support 15 bits.

        Running perl -V:randbits will tell you how many random bits your perl's rand() has.

        Cheers,
        Rob