in reply to Re^2: Is this a fair shuffle?
in thread Is this a fair shuffle?

I often wondered how using a random "constant" would change things but I never got around to trying it. It would also be good to add List::Util::shuffle to the benchmark. I assume it would beat my perl implementation hands down for small arrays, but mine is probably still quicker for large arrays because it operates in place?


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?

Replies are listed 'Best First'.
Re^4: Is this a fair shuffle?
by Roy Johnson (Monsignor) on Apr 01, 2005 at 05:12 UTC
    I just looked again, and your shufl sorts things in-place. That's not fair. But it doesn't seem to make a lot of difference in the benchmark time. I also tried changing qshuf to not be inconsistent, like so:
    sub qshuf { my %hash; sort { $hash{$a}{$b}||=(.5 <=> rand 1) } @_; }
    That is, it caches the value for each pairwise comparison, but it still doesn't shuffle well. Of course, there is the possibility of a > b > c > a...

    Update: I have a winner!

    sub qshuf { my %hash; sort { ($hash{$a}||=rand 1) <=> ($hash{$b}||=rand 1) } @_; } Rate xform slice shufl qshuf xform 65.6/s -- -34% -72% -100% slice 99.7/s 52% -- -57% -100% shufl 233/s 255% 133% -- -99% qshuf 44494/s 67764% 44534% 19019% -- permutation | slice | xform | shufl | qshuf -------------------------------------------------- Std. Dev. | 69.974 | 58.895 | 53.532 | 62.689

    Caution: Contents may have been coded under pressure.
      sub qshuf { my %hash; sort { ($hash{$a}||=rand 1) <=> ($hash{$b}||=rand 1) } @_; }
      That's close, but try it on a list with repetitions:
      for (1 .. 10) { print qshuf(qw/a b c D D D D/), $/; }
      All the D's in the list get the same random value, so they all tie each other in the comparisons and end up adjacent. You are better off using the Schwartzian Transform I outlined above, where each array position gets a unique random number.

      blokhead

      And you really think numbers like "19019%" mean you have a brilliant algorithm? Just accept that, don't ask any questions?

      Think again. Next time, try to run your sort in a list context instead of a void context (when Perl will just optimize the entire sort away) and see what happens:

      #!/usr/bin/perl -w use strict; use Benchmark qw(cmpthese); use Data::Dumper; use List::Util qw /shuffle/; sub xform { map {$_->[0]} sort { $a->[1] <=> $b->[1]} map {[$_, rand(1)]} @_; } sub slice { my @random; push @random, rand 1 for 0 .. $#_; @_[ sort { $random[$a] <=> $random[$b] } 0 .. $#_ ]; } sub shufl { $a = $_ + rand @_ - $_ and @_[$_, $a] = @_[$a, $_] for (0..$#_); return @_; } sub qshuf { my %hash; sort { ($hash{$a}||=rand 1) <=> ($hash{$b}||=rand 1) } @_; } my @array = 1 .. 1000; cmpthese(-10, { slice => sub { () = slice @array }, xform => sub { () = xform @array }, shufl => sub { () = shufl @array }, qshuf => sub { () = qshuf @array }, lutil => sub { () = shuffle @array }, }); __END__ Rate qshuf xform slice shufl lutil qshuf 90.7/s -- -12% -36% -66% -97% xform 103/s 13% -- -27% -61% -96% slice 141/s 55% 37% -- -47% -95% shufl 264/s 191% 157% 87% -- -91% lutil 2870/s 3066% 2694% 1939% 988% --
      Not a winner. Not anywhere near the winner either.
      That's not fair.

      I guess it depends upon what the aim is. If you simple want an array shuffled quicky, doing it in place is totally fair.

      But it doesn't seem to make a lot of difference in the benchmark time.

      As the arrays being shuffled only have 4 elements, the in-placeness makes almost no difference, but once you start shuffling arrays of a few hundred items, it has a fairly dramatic effect.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco.
      Rule 1 has a caveat! -- Who broke the cabal?
        Having one algorithm that shuffles in-place and the others that don't is not a fair comparison. As Abigail points out, the void context is a problem; they results should be assigned somewhere. Then the dramatic advantage of sort disappears.

        The speed test wasn't done on 4-element arrays, BTW.

        Odd, all the nodes in this thread list as having negative reps, but in my Nodes you Wrote, they're positive. And I gained XP overnight.


        Caution: Contents may have been coded under pressure.
Re^4: Is this a fair shuffle?
by tlm (Prior) on Apr 01, 2005 at 13:37 UTC

    It would also be good to add List::Util::shuffle to the benchmark. I assume it would beat my perl implementation hands down for small arrays, but mine is probably still quicker for large arrays because it operates in place?

    If you are referring to the Perl implementation of List::Util::shuffle, then yes a Perl implementation of FY (like yours) wins, but naturally the C implementation of List::Util::shuffle (which is the default) handily beats a Perl implementation of FY (lup = List::Util::shuffle in Perl; luc = List::Util::shuffle in C; fyp = Fisher-Yates in Perl):

    Rate lup fyp luc lup 1.64/s -- -12% -83% fyp 1.87/s 14% -- -80% luc 9.43/s 476% 406% --
    That's for sorting an array of 105 elements.

    the lowliest monk