in reply to Re^12: PDL and srand puzzle - predictability summary
in thread PDL and srand puzzle

Running threads, I'm unable to generate predictable results using PDL

Depends on the number of samples. I have the impression there is a somewhat rare condition triggering the bug. The more calls, the more probable its appearance.

Investigated several loop sizes and the number of identical results for several runs:

loops   predictable result
 512    100%
1024     60%
2048     20%
4096      0%

Greetings,
🐻

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

Re^14: PDL and srand puzzle - PDL non-thread testing
by marioroy (Prior) on Jun 08, 2024 at 14:54 UTC

    This is a non-thread version, and it exposes a PDL bug: PDL::srandom has no effect, so CORE::srand(N) must be called instead for predictable output. The relay block guarantees that worker 1 outputs first, followed by worker 2, and so on.

    Edit 1: MCE checks for PDL::Primitive->can('srand'), but missed checking PDL::Primitive->can('srandom'). Resolved in MCE v1.894 and MCE::Shared v1.889.

    Edit 2: MCE configures an internal seed. It turns out that MCE cannot know which srand or other seed setter the application uses. Releasing MCE 1.895 and MCE::Shared 1.890. I updated the demonstration to process a sequence of numbers (lower memory consumption). See also Predictability Summary.

    Loops 3276800

    #!/usr/bin/perl

    use v5.030;
    use PDL;
    use MCE 1.895;

    CORE::srand(3);    # This also, for MCE predictable results
                       # MCE sets internal seed = CORE::rand()
    PDL::srandom(3);   # PDL::srand(3) v2.062 ~ v2.089

    MCE->new(
        use_threads => 0,    # Ensure non-threads on Windows
        max_workers => 16,
        init_relay  => 0,
        user_func   => sub {
            my $output = "";
            for (1..3276800) {
                my $r = random();
                $output .= sprintf "%.72f\n", $r;
            }
            # Relay: workers print in order, worker 1 first
            MCE::relay {
                print $output;
            };
        }
    )->run;

    $ perl pdl-rand-mce.pl | wc -l
    52428800
    $ perl pdl-rand-mce.pl | cksum
    3755051732 3932160000
    $ perl pdl-rand-mce.pl | cksum
    3755051732 3932160000
    $ perl pdl-rand-mce.pl | cksum
    3755051732 3932160000
    $ perl pdl-rand-mce.pl | LC_ALL=C sort -u | wc -l
    $ perl pdl-rand-mce.pl | LC_ALL=C mcesort -j16 -u | wc -l
    $ perl pdl-rand-mce.pl | LC_ALL=C parsort --parallel=16 -u | wc -l
    52428799
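
The line and byte counts reported above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming every generated value lies in [0, 1), so each formatted line has a single digit before the decimal point. The variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $workers = 16;         # max_workers in the demonstration
my $loops   = 3276800;    # values generated per worker

# One formatted line: "0." + 72 digits + newline, assuming 0 <= value < 1.
my $bytes_per_line = length(sprintf("%.72f\n", 0.5));

my $lines = $workers * $loops;
my $bytes = $lines * $bytes_per_line;

print "lines: $lines\n";    # compare with the output of `wc -l`
print "bytes: $bytes\n";    # compare with the second field of `cksum`
```

Both figures agree with the `wc -l` result and with the byte count in the second `cksum` field.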

    Loops 3276800 Iterate Piddle

    #!/usr/bin/perl

    use v5.030;
    use PDL;
    use MCE 1.895;

    CORE::srand(3);    # This also, for MCE predictable results
                       # MCE sets internal seed = CORE::rand()
    PDL::srandom(3);   # PDL::srand(3) v2.062 ~ v2.089

    MCE->new(
        use_threads => 0,    # Ensure non-threads on Windows
        max_workers => 16,
        init_relay  => 0,
        user_func   => sub {
            my $output = "";
            # Generate all values at once, then iterate over the piddle
            my $pdl = PDL->random(3276800);
            foreach (0 .. $pdl->nelem - 1) {
                my $r = $pdl->at($_);
                $output .= sprintf "%.72f\n", $r;
            }
            # Relay: workers print in order, worker 1 first
            MCE::relay {
                print $output;
            };
        }
    )->run;

    $ perl pdl-rand-mce2.pl | wc -l
    52428800
    $ perl pdl-rand-mce2.pl | cksum
    1425016579 3932160000
    $ perl pdl-rand-mce2.pl | cksum
    1425016579 3932160000
    $ perl pdl-rand-mce2.pl | cksum
    1425016579 3932160000
    $ perl pdl-rand-mce2.pl | LC_ALL=C sort -u | wc -l
    $ perl pdl-rand-mce2.pl | LC_ALL=C mcesort -j16 -u | wc -l
    $ perl pdl-rand-mce2.pl | LC_ALL=C parsort --parallel=16 -u | wc -l
    52428799

    The parallel mcesort program is found at GitHub Gist. Another option is GNU parallel parsort.

      This exposes a PDL bug. That is PDL::srand/srandom has no effect and must call CORE::srand(N) instead for predictable output.
      Uh, no:
      $ perl -Mblib -MPDL -E 'say $PDL::VERSION'
      2.089_01
      $ perl -Mblib -MPDL -E 'CORE::srand(4); say PDL->random'
      0.437817146098168
      $ perl -Mblib -MPDL -E 'CORE::srand(4); say PDL->random'
      0.0596849513730282
      $ perl -Mblib -MPDL -E 'CORE::srand(4); say PDL->random'
      0.876085322091164
      $ perl -Mblib -MPDL -E 'CORE::srand(4); say PDL->random'
      0.0193676520337621
      $ perl -Mblib -MPDL -E 'srandom(4); say PDL->random'
      0.923230081572141
      $ perl -Mblib -MPDL -E 'srandom(4); say PDL->random'
      0.923230081572141
      $ perl -Mblib -MPDL -E 'srandom(4); say PDL->random'
      0.923230081572141
      $ perl -Mblib -MPDL -E 'srandom(4); say PDL->random'
      0.923230081572141

        Uh, no:

        I'm referring to the parallel non-thread demonstration. It does not produce predictable results unless CORE::srand is called.

        Edit 1: MCE checks for PDL::Primitive->can('srand'), but missed checking PDL::Primitive->can('srandom'). I will release MCE 1.894 and MCE::Shared 1.889.

        From here: Calling srand cannot be a workaround for this, because there is no interaction between Perl's RNG and PDL's. They are separate systems.
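
The "separate systems" point in the quote can be illustrated with a toy model. The sketch below uses a hypothetical library generator (`lib_seed` / `lib_rand`, a small Park-Miller LCG picked only for brevity) that keeps its own private state. It is not PDL's algorithm or API. It only shows that seeding Perl's built-in generator cannot influence a generator whose state lives elsewhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical library generator with private state (a small Park-Miller LCG,
# used only for illustration; it is not PDL's algorithm).
my $lib_state = 1;
sub lib_seed { $lib_state = shift }
sub lib_rand {
    $lib_state = ($lib_state * 16807) % 2147483647;
    return $lib_state / 2147483647;
}

# Seed both generators, then draw three values from the library generator.
sub draw {
    my ($core_seed, $library_seed) = @_;
    CORE::srand($core_seed);    # seeds only Perl's built-in rand()
    lib_seed($library_seed);    # seeds only the library generator
    return join " ", map { lib_rand() } 1 .. 3;
}

my $base          = draw(1, 4);
my $other_core    = draw(2, 4);    # only the core seed changed
my $other_library = draw(1, 5);    # only the library seed changed

print "core seed changed:    ", ($base eq $other_core    ? "same" : "different"), " library output\n";
print "library seed changed: ", ($base eq $other_library ? "same" : "different"), " library output\n";
```

In this model, changing the core seed leaves the library output untouched, and only the library's own seed setter changes it.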

        I was hoping that calling PDL::srandom before spinning up the non-threaded workers would produce predictable results.

        Edit 2: MCE cannot know or assume which srand or other setter function the application uses. Therefore, I reverted to CORE::rand() for MCE's internal seed. Releasing MCE 1.895 and MCE::Shared 1.890. Update: The parallel demonstration now processes a sequence of numbers, consuming less memory. See also Predictability Summary.

        From here: I have just realised that Perl's "threads" functionality might clash with this, if it uses POSIX threads: PDL's RNG has a global vector of n seeds, in a global C variable. PDL's code will, if it is being used in what it thinks is single-threaded mode, use the 0th offset in that. If multiple POSIX threads are accessing that single seed (which gets updated on each RNG generation), there will be a race condition, hence less uniqueness.
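
The race described in the quote can be modelled without real threads. This is a deterministic sketch under stated assumptions: `step` is a toy update rule (a Park-Miller LCG, not PDL's generator), and the "racy" case hard-codes one unlucky interleaving in which both workers read the shared seed before either writes it back. It only illustrates why a shared, unsynchronised seed can yield duplicate values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy update rule standing in for "advance the seed" (Park-Miller LCG,
# chosen for brevity; not PDL's actual generator).
sub step { return ($_[0] * 16807) % 2147483647 }

# Count the distinct values in a list.
sub distinct { my %seen; $seen{$_}++ for @_; return scalar keys %seen }

# Well-ordered access: each worker reads, advances, and writes back in turn.
my $seed = 3;
my @ordered;
for my $worker (qw(A B)) {
    $seed = step($seed);
    push @ordered, $seed;
}

# Racy access: both workers read the old seed before either writes back.
$seed = 3;
my ($read_a, $read_b) = ($seed, $seed);
my @racy = (step($read_a), step($read_b));
$seed = $racy[1];    # last writer wins; one update is lost

printf "ordered: %d distinct of %d\n", distinct(@ordered), scalar @ordered;
printf "racy:    %d distinct of %d\n", distinct(@racy),    scalar @racy;
```

With real threads the interleaving varies from run to run, which would fit the observation that more calls make a collision more likely.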

        Thank you for the clarity regarding spinning up threads.