Re: Algorithm RFC: fast (pseudo-)random shuffle with no repetition

Replies are listed 'Best First'.
Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition by tybalt89 (Monsignor) on Sep 23, 2023 at 05:13 UTC
Either golfed or just shorter, your choice :) #!/usr/bin/perl use strict; # https://www.perlmonks.org/?node_id=11154597 use warnings; use List::AllUtils qw( first shuffle ); my @in = shuffle <DATA>; my @out = shift @in; LOOP: while( @in ) { for my $in ( 0 .. $#in ) { if( defined( my $place = first { $_ == 0 ? $in[$in] ne $out[0] : # at start? $_ == @out ? $in[$in] ne $out[-1] : # at end? $out[$_-1] ne $in[$in] && $out[$_] ne $in[$in] # in middle? } 0 .. @out ) ) { splice @out, $place, 0, splice @in, $in, 1; next LOOP; } } die "FAILED\nin\n @in\nout\n @out"; } print @out; __DATA__ Line 1 Line 2 Line 3 Line 3 Line 3 Line 4 Line 8 [download]	[reply] [d/l]
Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition (constructive approach) by LanX (Saint) on Sep 23, 2023 at 12:49 UTC
> It only found 240 different orders in 1e7 runs. yes, there are only `240 = 10 * 4!` distinct solutions for this one. `qw/1 2 3 3 3 4 8/` EDIT There are 24= 4! possible permutations for lines 1,2,4 and 8 There are 10 possibilities to partition them into 4 groups to fill around the line 3s `(.)3(.)3(.)3(.)` where only the first and last partition is allowed to be empty, otherwise the 3 would collide. `(1)(2)(4)(8) ()(12)(4)(8) ()(1)(24)(8) ()(1)(2)(48) (12)(4)(8)() (1)(24)(8)() (1)(2)(48)() ()(1)(248)() ()(12)(48)() ()(124)(8)()` [download] NB: since 1,2,4 and 8 are distinct they can't collide inside the fillers. update I think this constructive approach is the best way to allow all possible solutions to appear and with the same likelihood. Because one only needs two random numbers `rnd(24)` and `rnd(10)` as "coordinates" to construct one. Cheers Rolf _{(addicted to the Perl Programming Language :) see Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^2: Algorithm RFC: fast (pseudo-)random shuffle with no repetition by Anonymous Monk on Sep 23, 2023 at 10:37 UTC
Thanks, looks like I overcomplicated it a lot, this is much faster and simpler. Perhaps because there was no answer like yours at SO? :-) FWIW, if compressibility (`Compress::Zlib::compress`) can serve as measure of entropy, then deflating joined output from 100 runs for the same 2096 items array input -- produces string of practically same length as my solution. So, yeah, one initial shuffle and then unshift/push/first seem OK to ensure enough randomness (and speed).	[reply] [d/l]
Re^3: Algorithm RFC: fast (pseudo-)random shuffle with no repetition by Anonymous Monk on Sep 25, 2023 at 09:17 UTC
if compressibility can serve as measure of entropy... Very-pseudo... Call it mad science, but I was quite amused with what follows. Let's consider 2 solutions to OP task: supf i.e. "shuffle/unshift/push/first" mrcs i.e. "matrix rows/columns shuffle" The former by tybalt98, code almost verbatim from 11154602, and the idea for the latter suggested by LanX. Other than "(almost) equidistant most frequent key(s)", I had no solid basis to say "very-pseudo", but gut feeling only. The `mrcs` is, as I said, quite visual and therefore e.g. OK as a demo to teach students. I didn't add error checking, neither optimized for speed (it won't become faster than `supf`, but fast enough already). Here's "result predictability" measurement: Looks like "supf" provides 100% of possible entropy Looks like "mrcs" provides 67% of possible entropy This is for entertainment only, I don't think I'm qualified for "what's randomness?" discussion. use strict; use warnings; use feature 'say'; use List::Util qw/ first shuffle max /; use Compress::Zlib 'compress'; use POSIX 'ceil'; use constant ITEM => qr(< \d+ >); my @input = fake_data( 555, 55 ); #say scalar @input; # 2096 my $low_mark = gauge(( \@input ) x 100 ); my $high_mark = gauge( map [ shuffle @input ], 0 .. 99 ); my $supf_mark = gauge( map verify( supf( \@input )), 0 .. 99 ); my $mrcs_mark = gauge( map verify( mrcs( \@input )), 0 .. 99 ); printf "Looks like \"supf\" provides %.0f%% of possible entropy\n", 100 * ( $supf_mark - $low_mark ) / ( $high_mark - $low_mark ); printf "Looks like \"mrcs\" provides %.0f%% of possible entropy\n", 100 * ( $mrcs_mark - $low_mark ) / ( $high_mark - $low_mark ); sub gauge { length compress join '', map @$_, @_ } sub verify { my $r = shift; my $s = join '', @$r; die if $s =~ /(ITEM)\1/; return $r } sub fake_data { my ( $uniq, $pivot ) = @_; my @tmp = map { sprintf '< %09d >', rand 1e9 } 0 ... $uniq; my @out; push @out, @tmp[ 0 .. $_ ] for 0 .. $pivot; @out = shuffle( @out, @tmp[ $pivot + 1 .. $uniq ]); return @out } sub supf { # shuffle/unshift/push/first my $input_ref = shift; my @in = shuffle @$input_ref; my @out = shift @in; LOOP: while( @in ) { for my $in ( 0 .. $#in ) { if( $in[$in] ne $out[0] ) # put at start { unshift @out, splice @in, $in, 1; next LOOP; } elsif( $in[$in] ne $out[-1] ) # put at end { push @out, splice @in, $in, 1; next LOOP; } else # put in first middle place it fits { if( my $pick = first { $out[$_ - 1] ne $in[$in] and $out[$_] ne $in[$in] } 1 .. $ +#out ) { splice @out, $pick, 0, splice @in, $in, 1; next LOOP; } } } die "FAILED\nin\n @in\nout\n @out"; } return \@out; } sub mrcs { # matrix rows/columns shuffle my $input_ref = shift; my %counts; $counts{ $_ } ++ for @$input_ref; my $nrows = max values %counts; my $ncols = ceil @$input_ref / $nrows; my @frequent_keys = grep { $counts{ $_ } == $nrows } keys %counts; delete @counts{ @frequent_keys }; my @arranged = (( map {( $_ ) x $nrows } @frequent_keys ), ( map {( $_ ) x $counts{ $_ }} keys %counts )); $#arranged = $nrows * $ncols - 1; my @matrix = map [ splice @arranged, 0, $nrows ], 0 .. $ncols - 1; my @row_perm = shuffle 1 .. $ncols - 1; # exclude 0th column my @col_perm = shuffle 0 .. $nrows - 1; for my $col ( @matrix ) { splice @$col, 0, $nrows, @$col[ @col_perm ]; } splice @matrix, 1, $ncols - 1, @matrix[ @row_perm ]; my @ret = grep defined, map { # transpose & flatten my $row_i = $_; map { $matrix[ $_ ][ $row_i ]} 0 .. $ncols - 1 } 0 .. $nrows - 1; return \@ret; } [download]	[reply] [d/l] [select]

EDIT

update