Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re: Randomly choosing from a set of alternatives with varying popularity

by hv (Prior)
on Mar 28, 2022 at 03:28 UTC ( [id://11142439]=note: print w/replies, xml ) Need Help??


in reply to Randomly choosing from a set of alternatives with varying popularity

One approach for favouring the unpopular answers would be to weight by the reciprocal of the votes, so jellyfish would be 10 times likelier to show up than alligator. Another would be to replace $v votes by $max + $min - $v (which would have the same jellyfish to alligator ratio, but would weight the middle choices differently).

Note that the reciprocal approach can use the same algorithm as 'popular()' if you remove the unnecessary int(...) from its first line. The min/max approach would work with or without that change.

Replies are listed 'Best First'.
Re^2: Randomly choosing from a set of alternatives with varying popularity
by LanX (Saint) on Mar 28, 2022 at 11:32 UTC
    > weight by the reciprocal of the votes

    FWIW, here my attempts with different reciprocal functions and evaluation of distribution. See also Re^3: Randomly choosing from a set of alternatives with varying popularity for interpretation.

    use strict; use warnings; use Data::Dump qw/pp dd/; my %vote = ( alligator => 100, bear => 90, cat => 80, jellyfish => 10 +); my $N =0; $N += $_ for values %vote; # --- invert by division, proportion of non-voters my $N2 = 0; my %vote2 = map { my $v2 = 1/$vote{$_}; $N2 += $v2; $_ => $v2 } keys %vote; # --- invert by subtraction, number of non-voters my $N3 = 0; my %vote3 = map { my $v3 = $N-$vote{$_}; $N3 += $v3; $_ => $v3 } keys %vote; my (%dist1,%dist2,%dist3); my $e=5; $dist1{pick($N,\%vote)}++ for 1..10**$e; $dist2{pick($N2,\%vote2)}++ for 1..10**$e; $dist3{pick($N3,\%vote3)}++ for 1..10**$e; pp \%dist1,\%dist2,\%dist3; sub pick { my ($N,$h_vote)=@_; my @deb; my $r =rand($N); push @deb,$r; scalar keys %$h_vote; # reset each while ( my ($k,$v) = each %$h_vote) { $r -= $v; push @deb,[$r,$k,$v]; return $k if $r <=0; } die "ERROR", pp \@deb; }
    OUTPUT:
    ( { alligator => 35891, bear => 31807, cat => 28792, jellyfish => 3510 + }, { alligator => 7497, bear => 8300, cat => 9481, jellyfish => 74722 } +, { alligator => 21416, bear => 22542, cat => 23870, jellyfish => 3217 +2 }, )

    On a tangent: had to debug why each failed me. Needed reset.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

      The reciprocal approach definitely gives me what I had in mind, more so than computing the non-voters. That's what I'll use. Thanks.

      (Incidentally, should your comment above read "invert by division, proportion of voters", or am I missing something?)

        > That's what I'll use. Thanks.

        Your welcome, but you should also thank hv. :)

        > (Incidentally, should your comment above read "invert by division, proportion of voters", or am I missing something?)

        It's a matter of perspective, (proportion of non-voters) = 1-(proportion of voters)

        If there is a better wording, I'd be happy to hear it.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

Re^2: Randomly choosing from a set of alternatives with varying popularity
by ibm1620 (Hermit) on Mar 29, 2022 at 22:43 UTC
    Thank you - I had a hunch reciprocals might be the way to reverse popularity, but wasn't sure the 'popular()' algorithm would work with them. (Was never entirely comfortable with floating point arithmetic :-)

      Well yeah, me neither: you know where you are with integers. :)

      If it reaches the stage that the maths becomes important, you would certainly want to take a lot more care. One way to do that would be to avoid the simplistic "walk a list of (floating-point) weights" algorithm, and instead work to make each reciprocal exact: for each of the options in turn, give it an exact 1/votes chance to get chosen, and if it fails proceed to the next. On reaching the end of the list without a choice, start again (which doesn't necessarily have to be in the same order).

      Getting an exact "1/votes" chance requires understanding a specific invariant of the RNG, the number of bits that it generates - exposed by perl via use Config; print $Config{randbits}. When something has 3 votes, for example, an RNG that generates 16 bits will slightly favour the value 0 (mod 3) by the ratio 21846:21845:21845. So for each option you would calculate $valid_lim = $votes * int(2**RNGbits / $votes) == 65535 in advance, and then re-roll any random value while ($rand = int(rand(2**$RNGbits))) >= $valid_lim before doing the modulus check choose_me() if ($rand % $votes) == 0, to remove that bias.

      That means doing more work (and using up more of the available entropy), so you won't want to do this unless the vote requires this sort of critical accuracy.

        Point taken! However, given that I'm looking for ways to control the 'silliness' of a dissociated-press algorithm, critical accuracy isn't necessary. :-)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://11142439]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having a coffee break in the Monastery: (3)
As of 2024-04-20 15:39 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found