use strict;
use warnings;
use Data::Dump qw/pp dd/;
my %vote = ( alligator => 100, bear => 90, cat => 80, jellyfish => 10
+);
my $N =0;
$N += $_ for values %vote;
# --- invert by division, proportion of non-voters
my $N2 = 0;
my %vote2 = map {
my $v2 = 1/$vote{$_};
$N2 += $v2;
$_ => $v2
} keys %vote;
# --- invert by subtraction, number of non-voters
my $N3 = 0;
my %vote3 = map {
my $v3 = $N-$vote{$_};
$N3 += $v3;
$_ => $v3
} keys %vote;
my (%dist1,%dist2,%dist3);
my $e=5;
$dist1{pick($N,\%vote)}++ for 1..10**$e;
$dist2{pick($N2,\%vote2)}++ for 1..10**$e;
$dist3{pick($N3,\%vote3)}++ for 1..10**$e;
pp \%dist1,\%dist2,\%dist3;
sub pick {
my ($N,$h_vote)=@_;
my @deb;
my $r =rand($N);
push @deb,$r;
scalar keys %$h_vote; # reset each
while ( my ($k,$v) = each %$h_vote) {
$r -= $v;
push @deb,[$r,$k,$v];
return $k if $r <=0;
}
die "ERROR", pp \@deb;
}
OUTPUT: (
{ alligator => 35891, bear => 31807, cat => 28792, jellyfish => 3510
+ },
{ alligator => 7497, bear => 8300, cat => 9481, jellyfish => 74722 }
+,
{ alligator => 21416, bear => 22542, cat => 23870, jellyfish => 3217
+2 },
)
On a tangent: had to debug why each failed me. Needed reset.
| [reply] [d/l] [select] |
The reciprocal approach definitely gives me what I had in mind, more so than computing the non-voters. That's what I'll use. Thanks.
(Incidentally, should your comment above read "invert by division, proportion of voters", or am I missing something?)
| [reply] |
> That's what I'll use. Thanks.
Your welcome, but you should also thank hv. :)
> (Incidentally, should your comment above read "invert by division, proportion of voters", or am I missing something?)
It's a matter of perspective, (proportion of non-voters) = 1-(proportion of voters)
If there is a better wording, I'd be happy to hear it.
| [reply] [d/l] |
Thank you - I had a hunch reciprocals might be the way to reverse popularity, but wasn't sure the 'popular()' algorithm would work with them. (Was never entirely comfortable with floating point arithmetic :-) | [reply] |
Well yeah, me neither: you know where you are with integers. :)
If it reaches the stage that the maths becomes important, you would certainly want to take a lot more care. One way to do that would be to avoid the simplistic "walk a list of (floating-point) weights" algorithm, and instead work to make each reciprocal exact: for each of the options in turn, give it an exact 1/votes chance to get chosen, and if it fails proceed to the next. On reaching the end of the list without a choice, start again (which doesn't necessarily have to be in the same order).
Getting an exact "1/votes" chance requires understanding a specific invariant of the RNG, the number of bits that it generates - exposed by perl via use Config; print $Config{randbits}. When something has 3 votes, for example, an RNG that generates 16 bits will slightly favour the value 0 (mod 3) by the ratio 21846:21845:21845. So for each option you would calculate $valid_lim = $votes * int(2**RNGbits / $votes) == 65535 in advance, and then re-roll any random value while ($rand = int(rand(2**$RNGbits))) >= $valid_lim before doing the modulus check choose_me() if ($rand % $votes) == 0, to remove that bias.
That means doing more work (and using up more of the available entropy), so you won't want to do this unless the vote requires this sort of critical accuracy.
| [reply] [d/l] [select] |
Point taken! However, given that I'm looking for ways to control the 'silliness' of a dissociated-press algorithm, critical accuracy isn't necessary. :-)
| [reply] |