AK108 has asked for the wisdom of the Perl Monks concerning the following question:

I am looking for advice on something that I term as a "stable shuffle". Basically, it's a shuffle, but with another parameter that tells it how to sort. This enables me to reproduce the shuffled result instead of storing it in the database.

It turns out that my routine (following) is horribly flawed and can't generate all permutations of an array.

sub stable_shuffle { # VERY BROKEN, do not use! # My intent was to turn the seed into a number, # and then use it to pick a predictably random # item off of the array. # my $seed = join '', map ord, split //, shift; # return map {splice(@_, $seed % @_, 1)} @_; }

The effect I want can be seen with something like this:

#!/usr/bin/perl use strict; use warnings; srand(1); # Or whatever known value. use List::Util 'shuffle'; my @arr = ('A' .. 'Z'); print "$_\n" for shuffle @arr;

Unfortunately, this method removes randomness from then on, which isn't acceptable. Reseeding is a possibility, but ugly. I have things later on that I want to be really random.

Does anyone have any ideas on how I could make a routine that's decent? I don't need something truly random. Thanks for any help you can give me!

Replies are listed 'Best First'.
Re: A reproducible shuffle? ("stable shuffle")
by moritz (Cardinal) on Feb 04, 2008 at 09:14 UTC
    I don't think that reseeding is ugly, since you can encapsulate it in a sub:
    use strict; use warnings; use List::Util qw(shuffle); sub my_shuffle { my $old_seed = rand(2**32); srand(1); my @shuffled = shuffle(@_); srand($old_seed); return @shuffled; } print my_shuffle(qw(a b c d e f g)), $/; print my_shuffle(qw(a b c d e f g)), $/; print my_shuffle(qw(a b c d e f g)), $/;

    Actually I'm not sure if the reseeding is done correctly.

    Another possibility is to take List::Util's shuffle, and replace calls to rand with your own pseudo-random number generator, that you can keep separate from rand.

      Reseeding doesn't need to be cryptographically perfect. srand(Time::HiRes::time() ^ $$ ^ $other_randomness) should be good enough. Also, I have access to /dev/random and /dev/urandom, though I am uncertain how random they are (shared webhost), and it might be a slow operation. ps aux or a similar command (as suggested by perldoc -f random) is useless on this particular webhost due to their implementation.

      This is for a game, though, so I may decide to implement a PRNG anyway. In that case, losing the random seed wouldn't matter.

        For a game the reseeding in my post is certainly enough.

        /dev/random usually contains really good random numbers, but reading it could block your scripts by seconds or even minutes if the machine is low on entropy.

        /dev/urandom is much faster, but contains less "real" entropy. So /dev/urandom or the last state from rand before setting the seed should be OK.

Re: A reproducible shuffle? ("stable shuffle")
by Corion (Patriarch) on Feb 04, 2008 at 08:49 UTC

    If you want to keep a shuffle persistent, I would supply the "shuffle order" as a config file and also allow the option of easily creating a new shuffle:

    use strict; use List::Util 'shuffle'; use GetOpt::Long; GetOptions( ... ); my @items = get_all_items; my @shuffle = shuffle 0..$#items; if ($save_shuffle_sequence) { open my $fh, ">", $save_shuffle_sequence or die "Couldn't save shuffle sequence to '$save_shuffle_seque +nce': $!"; print {$fh} "@shuffle"; }; if ($load_shuffle_sequence) { open my $fh, "<", $load_shuffle_sequence or die "Couldn't load shuffle sequence from '$load_shuffle_seq +uence': $!"; @shuffle = map { split /\s+/ }, <$fh>; }; my @shuffled_items = @items[ @shuffle ]; my @other_items_shuffled_in_parallel = @other_items[ @shuffle ];
Re: A reproducible shuffle? ("stable shuffle")
by BrowserUk (Patriarch) on Feb 04, 2008 at 09:39 UTC

    A "stable shuffle" is just an arbitrary arrangement that doesn't change. So, why not just produce one and then embed it in the script(s) that requires it. Using your example above:

    C:\test>perl -MList::Util=shuffle -wle"print join ' ', shuffle 'A'..'Z +'" Q I X H L R B C F E U M A S W J P K T O V N Y D G Z

    And then just c&p that into your code:

    ... my @arr = qw[ Q I X H L R B C F E U M A S W J P K T O V N Y D G Z ]; ...

    There you have a "stable shuffle" that stays stable until you change it. Without messing around with storing it in a DB or file.

    (I also find your reasoning for not using srand inexplicable, but that's been said, and this is simpler:)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Well, it's not just one random order I need, but some arbitrary number that increases with time. Also, the items in the order could be changed or added to. Re^2: A reproducible shuffle? ("stable shuffle") explains better.

      As for wanting to keep srand() intact, I have other code that can occur in the same page load (web application) that needs a random value (or at least a good psuedorandom value).

Re: A reproducible shuffle? ("stable shuffle")
by dragonchild (Archbishop) on Feb 04, 2008 at 14:10 UTC
    What exactly are you attempting to accomplish? I'm a little confused because you, on one hand, want to randomly sort an array. But, on the other hand, you want to sort the array according to a known ordering. Those are (on the surface) incompatible goals.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      2. Yes, it's alot easier to assess correct code behaviour if you know the input... Of course it is important to have several, including patological, sequences of nonrandom for a good suite of tests. :o)
        Sure. But, you store your pathological sequences for testing, often in something like DBM::Deep. This didn't sound like a testing problem.

        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      In the game I'm writing, we have a coordinate system (x,y,z, with z representing the planet number). Moving from one coordinate to another is slow, but an advanced ability allows players to "jump" to nearby coordinates (distance <5 or so) where certain criteria are met. (basically a player action needs to be performed on the starting point and the ending point.)

      For every one of those special coordinates, there is a list of the surrounding special coordinates. I don't want to directly give out which coordinates a certain "jump" results in, so instead they get a letter identifying each "jump" possible from the current spot.

      The idea here is that the coordinate is the randomness telling how to shuffle the coordinates. This way, the jumps from a given coordinate are consistent forever in the future so they could record them and say "oh, to get to (1, 2, 5) from (10, 2, 4) I can jump A, C, G, Y."

      The alternative is to generate the list of jumps once and store it. I do not want to do this, since the coordinate lists won't be used too often. The seed is already there in the form of the coordinates. The major problem, though, is the addition of a coordinate to the list of jumps. I would have to update every neighboring coordinate list. Removal would be a similar pain, and I'd need to ensure that the other items aren't disturbed.

      There are also other applications that I have in mind, but this is the one that would benefit from this the most.

        What about creating your own randomness algorithm? They're pretty simple to write and you don't need all the crazy stuff that most randomness algorithms need. Something like:
        sub next_random_number { my ($prev_number) = @_; $prev_number ||= 0; return ($prev_number + 14412312 ) / 123142131; }
        Now, that obviously wouldn't be good enough. But, you just need to generate and store the algorithm that comes up with the list, whatever that is.

        An alternative which comes to mind (given that you're only doing this every so often) is to spawn a new process to come up with the shuffled list and let that process have its random number generator set. That way, the parent process is still fine, but you're able to set the stuff properly.


        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: A reproducible shuffle? ("stable shuffle")
by roboticus (Chancellor) on Feb 04, 2008 at 14:02 UTC
    AK108:

    Unfortunately, this method removes randomness from then on, which isn't acceptable. Reseeding is a possibility, but ugly. I have things later on that I want to be really random.
    You can have both a stable shuffle and your randomness requirement by using two random number generators. Use one for the shuffle, reseeding to recreate the original shuffle. Use the other for all your other random needs...

    ...roboticus

Re: A reproducible shuffle? ("stable shuffle")
by pajout (Curate) on Feb 04, 2008 at 18:57 UTC
    What about something like this?

    #!/usr/bin/perl use strict; use warnings; use Digest::MD5 qw(md5); use Data::Dumper; my $salt='foo'; my @data = split('','abcdefghijklmnopqrstuvwxyz'); my @out = sort { md5($b.$salt) cmp md5($a.$salt) } @data; print "output ".Dumper(\@out);
      That works ideally. I think this is the best solution to my question.
OT Re: A reproducible shuffle? ("stable shuffle")
by ww (Archbishop) on Feb 04, 2008 at 14:03 UTC
    Can't resist... but a "stable shuffle" sounds a lot like "stacking the deck."
      It is, in a way. Thankfully for everyone involved, we're not playing cards. I was thinking of explaining this as a way to replay the dealings of a card game, but that's a bad analogy. This only requires psuedorandom ordering and the repeatibility is more important. Re^2: A reproducible shuffle? ("stable shuffle") explains better.
Re: A reproducible shuffle? (more seeds)
by tye (Sage) on Feb 04, 2008 at 18:34 UTC

    To create a bunch of reproducible shuffles, I'd write a shuffle (not particularly hard, you even have example source code for such on CPAN) that uses a Mersenne Twister. The seed space for rand is too small to cover the number of possible shuffles for even a fairly small number of items. A Mersenne Twister supports much larger seeds (and you can find implementations of that on CPAN as well) and so is more likely to be able produce many more of the possible shuffles.

    But I'm not convinced that a reproducible shuffle is the best choice for the game scenario you described.

    - tye        

Re: A reproducible shuffle? ("stable shuffle")
by Limbic~Region (Chancellor) on Feb 04, 2008 at 18:28 UTC
    AK108,
    Admittedly, I haven't read most of this thread. It sounds like what you want is shuffle() that takes 1 or 2 arguments. With one argument (the list), it will return a random permutation of a list along with an ID. If called with two arguments (the list and an ID), it will return the exact same permutation every time.

    If this sounds like what you want, there may be a way to create a rank()/unrank() routine for permutations. I have done it for combinations before. If nothing else, you could create a permutations iterator where the ID is the number of iterations to generate the desired shuffle. You could then just have the iterator loop ID times.

    Cheers - L~R

      Yes, that is what I have in mind. I'll take a while to digest the code and see if it will work.