in reply to Re^11: PDL and srand puzzle - not likely fewer random bits than rand
in thread PDL and srand puzzle

Rmpfr_dump( Math::MPFR->new($r.at(0)) );

In executing that, you are assigning 15-significant-decimal-digit strings (e.g. 0.816545445367819) to a 53-bit precision float (double) ... and then observing that the assignment populates all 53 bits:
D:\>perl -MMath::MPFR=":mpfr" -MPDL -le "for(1..10) {$r = PDL->random(); Rmpfr_dump(Math::MPFR->new($r.at(0)))}"
0.11101001100100000110110101110010100111100110010011000E-1
0.10110010100010010011101011001111010011110001011001001E-2
0.11101000000110110001010100000001011101100000111011111E-3
0.10110010010001101100111011100101000000001001100100011E0
0.11010000000101010100010000110000010010011110111101111E0
0.11101010010011100000100010101111011100001010000100111E-3
0.11001010100111101101000101101010111011100110010001111E-1
0.11111111111011111000010100111010010010001010101010110E-1
0.10111001110100001011011100010100001101011001101011010E-2
0.11011111011110110001110011111111011111010111110001100E-1
But you'll see the same thing if you use 15-significant-decimal-digit representations of values provided by rand():
D:\>perl -MMath::MPFR=":mpfr" -MPDL -le "for(1..10) {Rmpfr_dump(Math::MPFR->new(sprintf '%.15g', rand()))}"
0.11011001011111100110011011010010100010000000100011011E-3
0.10001011110110011101000101001110110001101111000000110E-1
0.11001011001111001101010100001100101010001000110000110E-2
0.11110111001110000011011110111000011000101111000111101E0
0.11110111100000100111010101111111100011010001010100010E0
0.10001110111001111100101011010111110111111111011111000E-1
0.10101100000000101101100111000011001011001010110111010E-1
0.11010110011101110001001101010011001011010001010111111E0
0.11111010001001001100100110011111101100110100011110010E-3
0.11010111110100100001010011100100000011000000000000100E-4
Yet, we already know (and you have just shown it) that rand() does not populate the 5 lowest bits.
This behaviour is inevitable when double-precision values are rounded to 15 decimal digit precision, and then assigned back in.
If 17 decimal digits of precision were provided, this capacity to mislead would go away:
D:\>perl -MMath::MPFR=":mpfr" -le "for(1..10) { Rmpfr_dump(Math::MPFR->new(sprintf '%.17g', rand()))}"
0.10110100110110100010000010011000101000100000001000000E-1
0.10001101011001100001100001100010110001101111000000000E-1
0.11111010010100100101001100000101001010100010001100000E0
0.11000011010000000101111010100100110001011110010000000E-1
0.11111000011001100111110000100011000110100010101000000E-1
0.11101100001000111011110010101111011111111110000000000E-3
0.10001001000110010110010011000011100101100101011100000E0
0.10100000101110001010101010001101001011010001011000000E0
0.11100000100010000001100011100101111101100110100100000E0
0.10000000011000110110000011000010000001100000000000000E-3
Unfortunately, there's not much to be gleaned from looking at double-precision values rounded to 15-significant-decimal-digit strings.
We need to be looking at how those values were derived.
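
The rounding effect described above can be reproduced without Math::MPFR. The following is only a minimal sketch, assuming that perl's NV is an IEEE 754 double and that perl's string-to-number conversion is correctly rounded (as in recent perls). The helper names trailing_zero_bits and via_decimal are made up for this illustration. It takes a value with only 48 significant bits, sends it through 15-digit and 17-digit decimal strings, and counts the zero bits at the low end of the significand each time.

```perl
use strict;
use warnings;

# Count the trailing zero bits in the 52 stored significand bits of a double.
# Assumes pack 'd' yields an IEEE 754 binary64 value.
sub trailing_zero_bits {
    my ($x) = @_;
    my $bits = unpack 'B64', pack 'd>', $x;   # sign, exponent, significand
    my ($zeros) = substr($bits, 12) =~ /(0*)$/;
    return length $zeros;
}

# Round a value to $digits significant decimal digits and read it back.
sub via_decimal {
    my ($x, $digits) = @_;
    return 0 + sprintf "%.${digits}g", $x;
}

# A value with only 48 significant bits, so the 5 lowest bits are zero.
my $x = 0.5 + 2**-48;

printf "original : %2d trailing zero bits\n", trailing_zero_bits($x);
printf "15 digits: %2d trailing zero bits\n", trailing_zero_bits(via_decimal($x, 15));
printf "17 digits: %2d trailing zero bits\n", trailing_zero_bits(via_decimal($x, 17));
```

For this particular value, the 15-digit round trip lands on a neighbouring double with some of its low bits set, whereas the 17-digit round trip returns the original value with its low bits still zero.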

Update: Mind you, I could be barking up the wrong tree, anyway - which is even more likely if this issue you've identified is limited to threading.

Cheers,
Rob

Re^13: PDL and srand puzzle - MCE v1.892, v1.893 updates
by marioroy (Prior) on Jun 08, 2024 at 05:55 UTC
    From etj: Could you take a look at the srand code and see if anything is obviously wrong? Also, are you able to see if the duplicates are in groups i.e. sequences?
    From Rob: We need to be looking at how those values were derived.

    I'm hoping for the PDL development team to chime in. The testing was out of curiosity. Running non-threads here, PDL generates approximately one billion unique values. For threads (uncomment use threads at the top of the script), more than 92% are unique. That is better than Math::Random::random_normal/random_uniform.

    Calling PDL::srandom before spawning workers has no effect on predictability. I commented out that line in my last post. Predictable results require calling CORE::srand instead.

    Care is needed whether to call a PDL function or class method.

    PDL::srandom(42);      # not PDL->srandom(42)
    $r = PDL->random();    # not PDL::random();
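
Why the call style matters can be seen in plain Perl, independent of PDL: a class-method call passes the class name as the first argument, so an argument such as 42 arrives second. A minimal sketch follows, using a hypothetical Demo package that is not part of PDL. It does not claim to show what PDL does internally with the extra argument.

```perl
use strict;
use warnings;

# Hypothetical package, for illustration only; it is not part of PDL.
package Demo;

# Return whatever arrives as the first argument, so that the two call
# styles can be compared.
sub first_arg { return $_[0] }

package main;

my $as_function = Demo::first_arg(42);   # @_ is (42)
my $as_method   = Demo->first_arg(42);   # @_ is ('Demo', 42)

print "function call sees: $as_function\n";
print "method call sees:   $as_method\n";
```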

    I completed validation on the MCE side. Spawning child processes (non-threads), nearly all of the one billion values are unique (> 99.99995%) for PDL->random, Math::Prime::Util::drand(), and Math::Random::MT::Auto::rand(). Math::Random::random_uniform() and Math::Random::random_normal() give 80% ~ 82% unique.

    Running threads, Math::Random::random_normal and random_uniform take beyond 2 minutes to output one billion lines and score less than 20% unique.

    The relevant code in MCE 1.891 can be found here lines 644-662 and here lines 2036-2071. MCE::Child and MCE::Hobo have similar code.

    See also, Random Numbers Overview by danaj.

    Update:

    It turns out that I can seed the generator inside threads for CORE, Math::Prime::Util, and Math::Random::MT::Auto and get predictable results that match non-threads. I released MCE v1.892/v1.893, removing the check for whether threads are being spawned. On older Perls, v1.893 still calls CORE::srand(N) for non-threads only.

    # Sets the seed of the base generator uniquely between workers.
    # The new seed is computed using the current seed and ID value.
    # One may set the seed at the application level for predictable
    # results (non-thread workers only). Ditto for Math::Prime::Util,
    # Math::Random, Math::Random::MT::Auto, and PDL.
    #
    # MCE 1.892, 2024-06-08
    # Removed check if spawning threads i.e. use_threads.
    # Predictable output matches non-threads for CORE,
    # Math::Prime::Util, and Math::Random::MT::Auto.
    # https://perlmonks.org/?node_id=11159834

    {
        my $_wid  = $_args[1];
        my $_seed = abs($self->{_seed} - ($_wid * 100000)) % 2147483560;

        CORE::srand($_seed) if (!$self->{use_threads} || $] ge '5.020000'); # drand48
        Math::Prime::Util::srand($_seed) if $INC{'Math/Prime/Util.pm'};

        if (!$self->{use_threads}) {
            PDL::srand($_seed) if $INC{'PDL.pm'} && PDL->can('srand'); # PDL 2.062 ~ 2.089
            PDL::srandom($_seed) if $INC{'PDL.pm'} && PDL->can('srandom'); # PDL 2.089_01+
        }
    }

    if (!$self->{use_threads} && $INC{'Math/Random.pm'}) {
        my ($_wid, $_cur_seed) = ($_args[1], Math::Random::random_get_seed());
        my $_new_seed = ($_cur_seed < 1073741781)
            ? $_cur_seed + (($_wid * 100000) % 1073741780)
            : $_cur_seed - (($_wid * 100000) % 1073741780);

        Math::Random::random_set_seed($_new_seed, $_new_seed);
    }

    if ($INC{'Math/Random/MT/Auto.pm'}) {
        my ($_wid, $_cur_seed) = (
            $_args[1], Math::Random::MT::Auto::get_seed()->[0]
        );
        my $_new_seed = ($_cur_seed < 1073741781)
            ? $_cur_seed + (($_wid * 100000) % 1073741780)
            : $_cur_seed - (($_wid * 100000) % 1073741780);

        Math::Random::MT::Auto::set_seed($_new_seed);
    }
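
The per-worker seed arithmetic from the first block above can be tried in isolation. This is only a sketch using core Perl: worker_seed and draw are hypothetical helper names, not MCE functions, and the base seed of 42 is arbitrary. It derives a distinct seed for each worker ID from one base seed, then checks that reseeding CORE::srand with the same value reproduces the same CORE::rand sequence within a single process.

```perl
use strict;
use warnings;

# Hypothetical helper (not an MCE function): derive a per-worker seed
# from a base seed and a worker ID using the formula quoted above.
sub worker_seed {
    my ($base_seed, $wid) = @_;
    return abs($base_seed - ($wid * 100000)) % 2147483560;
}

# Seed the core generator, then draw $count values from it.
sub draw {
    my ($seed, $count) = @_;
    CORE::srand($seed);
    return map { CORE::rand() } 1 .. $count;
}

my $base = 42;   # application-level seed, chosen arbitrarily
for my $wid (1 .. 3) {
    my $seed   = worker_seed($base, $wid);
    my @first  = draw($seed, 3);
    my @second = draw($seed, 3);   # same seed, so the same sequence
    printf "worker %d seed %d repeatable: %s\n",
        $wid, $seed, ("@first" eq "@second" ? 'yes' : 'no');
}
```

This says nothing about how the generators behave across threads or child processes, which is what the tests in this thread measure.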
      I'm hoping for the PDL development team to chime in.
      I honestly wonder who you think that is.