in reply to Algorithm RFC: fast (pseudo-)random shuffle with no repetition

Second pass at it - Now for something completely different...

#!/usr/bin/perl use strict; # https://www.perlmonks.org/?node_id=11154597 use warnings; use List::AllUtils qw( first shuffle ); my @in = shuffle <DATA>; my @out = shift @in; LOOP: while( @in ) { for my $in ( 0 .. $#in ) { if( $in[$in] ne $out[0] ) # put at start { unshift @out, splice @in, $in, 1; next LOOP; } elsif( $in[$in] ne $out[-1] ) # put at end { push @out, splice @in, $in, 1; next LOOP; } else # put in first middle place it fits { if( my $pick = first { $out[$_ - 1] ne $in[$in] and $out[$_] ne $in[$in] } 1 .. $#out + ) { splice @out, $pick, 0, splice @in, $in, 1; next LOOP; } } } die "FAILED\nin\n @in\nout\n @out"; } print @out; __DATA__ Line 1 Line 2 Line 3 Line 3 Line 3 Line 4 Line 8

BTW: It only found 240 different orders in 1e7 runs.

  • Comment on Re: Algorithm RFC: fast (pseudo-)random shuffle with no repetition
  • Download Code

Replies are listed 'Best First'.
Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition
by tybalt89 (Monsignor) on Sep 23, 2023 at 05:13 UTC

    Either golfed or just shorter, your choice :)

    #!/usr/bin/perl use strict; # https://www.perlmonks.org/?node_id=11154597 use warnings; use List::AllUtils qw( first shuffle ); my @in = shuffle <DATA>; my @out = shift @in; LOOP: while( @in ) { for my $in ( 0 .. $#in ) { if( defined( my $place = first { $_ == 0 ? $in[$in] ne $out[0] : # at start? $_ == @out ? $in[$in] ne $out[-1] : # at end? $out[$_-1] ne $in[$in] && $out[$_] ne $in[$in] # in middle? } 0 .. @out ) ) { splice @out, $place, 0, splice @in, $in, 1; next LOOP; } } die "FAILED\nin\n @in\nout\n @out"; } print @out; __DATA__ Line 1 Line 2 Line 3 Line 3 Line 3 Line 4 Line 8
Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition (constructive approach)
by LanX (Saint) on Sep 23, 2023 at 12:49 UTC
    > It only found 240 different orders in 1e7 runs.

    yes, there are only 240 = 10 * 4! distinct solutions for this one. qw/1 2 3 3 3 4 8/

    EDIT

    There are 24= 4! possible permutations for lines 1,2,4 and 8

    There are 10 possibilities to partition them into 4 groups to fill around the line 3s

    (.)3(.)3(.)3(.)

    where only the first and last partition is allowed to be empty, otherwise the 3 would collide.

    (1)(2)(4)(8) ()(12)(4)(8) ()(1)(24)(8) ()(1)(2)(48) (12)(4)(8)() (1)(24)(8)() (1)(2)(48)() ()(1)(248)() ()(12)(48)() ()(124)(8)()

    NB: since 1,2,4 and 8 are distinct they can't collide inside the fillers.

    update

    I think this constructive approach is the best way to allow all possible solutions to appear and with the same likelihood.

    Because one only needs two random numbers rnd(24) and rnd(10) as "coordinates" to construct one.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    see Wikisyntax for the Monastery

Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition
by Anonymous Monk on Sep 23, 2023 at 10:37 UTC

    Thanks, looks like I overcomplicated it a lot, this is much faster and simpler. Perhaps because there was no answer like yours at SO? :-) FWIW, if compressibility (Compress::Zlib::compress) can serve as measure of entropy, then deflating joined output from 100 runs for the same 2096 items array input -- produces string of practically same length as my solution. So, yeah, one initial shuffle and then unshift/push/first seem OK to ensure enough randomness (and speed).

      if compressibility can serve as measure of entropy...
      Very-pseudo...

      Call it mad science, but I was quite amused with what follows. Let's consider 2 solutions to OP task:

      • supf i.e. "shuffle/unshift/push/first"
      • mrcs i.e. "matrix rows/columns shuffle"

      The former by tybalt98, code almost verbatim from 11154602, and the idea for the latter suggested by LanX. Other than "(almost) equidistant most frequent key(s)", I had no solid basis to say "very-pseudo", but gut feeling only. The mrcs is, as I said, quite visual and therefore e.g. OK as a demo to teach students. I didn't add error checking, neither optimized for speed (it won't become faster than supf, but fast enough already). Here's "result predictability" measurement:

      Looks like "supf" provides 100% of possible entropy
      Looks like "mrcs" provides 67% of possible entropy
      

      This is for entertainment only, I don't think I'm qualified for "what's randomness?" discussion.

      use strict; use warnings; use feature 'say'; use List::Util qw/ first shuffle max /; use Compress::Zlib 'compress'; use POSIX 'ceil'; use constant ITEM => qr(< \d+ >); my @input = fake_data( 555, 55 ); #say scalar @input; # 2096 my $low_mark = gauge(( \@input ) x 100 ); my $high_mark = gauge( map [ shuffle @input ], 0 .. 99 ); my $supf_mark = gauge( map verify( supf( \@input )), 0 .. 99 ); my $mrcs_mark = gauge( map verify( mrcs( \@input )), 0 .. 99 ); printf "Looks like \"supf\" provides %.0f%% of possible entropy\n", 100 * ( $supf_mark - $low_mark ) / ( $high_mark - $low_mark ); printf "Looks like \"mrcs\" provides %.0f%% of possible entropy\n", 100 * ( $mrcs_mark - $low_mark ) / ( $high_mark - $low_mark ); sub gauge { length compress join '', map @$_, @_ } sub verify { my $r = shift; my $s = join '', @$r; die if $s =~ /(ITEM)\1/; return $r } sub fake_data { my ( $uniq, $pivot ) = @_; my @tmp = map { sprintf '< %09d >', rand 1e9 } 0 ... $uniq; my @out; push @out, @tmp[ 0 .. $_ ] for 0 .. $pivot; @out = shuffle( @out, @tmp[ $pivot + 1 .. $uniq ]); return @out } sub supf { # shuffle/unshift/push/first my $input_ref = shift; my @in = shuffle @$input_ref; my @out = shift @in; LOOP: while( @in ) { for my $in ( 0 .. $#in ) { if( $in[$in] ne $out[0] ) # put at start { unshift @out, splice @in, $in, 1; next LOOP; } elsif( $in[$in] ne $out[-1] ) # put at end { push @out, splice @in, $in, 1; next LOOP; } else # put in first middle place it fits { if( my $pick = first { $out[$_ - 1] ne $in[$in] and $out[$_] ne $in[$in] } 1 .. $ +#out ) { splice @out, $pick, 0, splice @in, $in, 1; next LOOP; } } } die "FAILED\nin\n @in\nout\n @out"; } return \@out; } sub mrcs { # matrix rows/columns shuffle my $input_ref = shift; my %counts; $counts{ $_ } ++ for @$input_ref; my $nrows = max values %counts; my $ncols = ceil @$input_ref / $nrows; my @frequent_keys = grep { $counts{ $_ } == $nrows } keys %counts; delete @counts{ @frequent_keys }; my @arranged = (( map {( $_ ) x $nrows } @frequent_keys ), ( map {( $_ ) x $counts{ $_ }} keys %counts )); $#arranged = $nrows * $ncols - 1; my @matrix = map [ splice @arranged, 0, $nrows ], 0 .. $ncols - 1; my @row_perm = shuffle 1 .. $ncols - 1; # exclude 0th column my @col_perm = shuffle 0 .. $nrows - 1; for my $col ( @matrix ) { splice @$col, 0, $nrows, @$col[ @col_perm ]; } splice @matrix, 1, $ncols - 1, @matrix[ @row_perm ]; my @ret = grep defined, map { # transpose & flatten my $row_i = $_; map { $matrix[ $_ ][ $row_i ]} 0 .. $ncols - 1 } 0 .. $nrows - 1; return \@ret; }