Re: Randomly choosing from a set of alternatives with varying popularity

> To favor the popular, you choose a random number N from 0 to 549, then traverse the set of animals, subtracting each one's popularity from N until N goes negative, and that's the animal you choose.

I don't think that's a good idea. More than 80% of your random numbers will be bigger than 100, which means none will be picked.

This also explains why you have problems inverting this "solution".

Furthermore a linear search in the hash won't be truly random, because the hash order is fixed within the duration of a run. This will result in a disproportionate representation of the early entries.

Solution is easy:

sort the N entries and chose a modified random function which favors one end of the range, like $i = int $N * rand() ** $e and $j = $N - 1 - $i for index in your sorted list. The way you chose $e > 1 will decide about the degree of bias ( $e=1 means no bias)

Just test the distribution of the indices:

  DB<41> %h=();$e=3; $N=10; $h{ int $N * rand() **$e }++ for 1..10000

  DB<42> x \%h
0  HASH(0x2f25fe8)
   0 => 4670
   1 => 1215
   2 => 838
   3 => 686
   4 => 550
   5 => 484
   6 => 407
   7 => 388
   8 => 426
   9 => 336
  DB<43>
[download]

In this example the first 10% are more than 10 times more likely to be picked than the last 10%.

Of course YMMV about the distribution you prefer, but the basic idea is the same.

Cheers Rolf
_{(addicted to the Perl Programming Language :)

Wikisyntax for the Monastery}

Comment on Re: Randomly choosing from a set of alternatives with varying popularity Select or Download Code

Replies are listed 'Best First'.
Re^2: Randomly choosing from a set of alternatives with varying popularity by hv (Prior) on Mar 28, 2022 at 03:35 UTC
I suspect you've misunderstood the 'popular' algorithm: 550 is the sum of the votes, so if (for example) the hash is traversed in order of decreasing popularity, rolling 0..99 will give alligator, 100..179 will give bear, etc., until 540..549 gives jellyfish. As far as I know this is pretty much the standard approach (up to optimizations) for picking from a set with weights.	[reply]
Re^3: Randomly choosing from a set of alternatives with varying popularity by LanX (Saint) on Mar 28, 2022 at 08:59 UTC
> rolling 0..99 will give alligator, 100..179 will give bear, etc., until 540..549 gives jellyfish. Indeed, I missed the fact that N is an accumulated sum where all votes so far are subtracted. That wasn't obvious from the description and the code wasn't concise. > As far as I know this is pretty much the standard approach I'm curious to know how my other objection is handled. The order in a Perl hash is random, but fixed during a run. So this algorithm will be biased to some results. Do you have a link to this "standard approach" discussing the distribution of picks? My guess is the list must be sorted in ascending order. Or to be more precise that the choice over ascending or descending decides over popular vs unpopular. > 0..99 will give alligator, 100..179 will give bear, etc., until 540..549 gives jellyfish. FWIW, the results you give can only be reproduced with a descending sort. > > > `{ alligator => 100, bear => 90, cat => 80, ... jellyfish => 10 }` (well almost `s/179/189/` ) NB: the OP has uncommented the `sort` in his code. > > > `# sort { ($votes{$b} <=> $votes{$a}) } #unnecessary` Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery}	[reply] [d/l] [select]
Re^4: Randomly choosing from a set of alternatives with varying popularity by LanX (Saint) on Mar 28, 2022 at 10:16 UTC
> I'm curious to know how my other objection is handled. The order in a Perl hash is random, but fixed during a run. So this algorithm will be biased to some results. OK, after a second coffee I finally got my had wrapped around it. Definition/Interpretation matters: The random pick $r between from `0..N-1` represents what the voter $r has chosen. Hence order doesn't matter as long as it is fixed, since a 100 wide interval will always have the same probability of 100/550, no matter where it occurs.² But I don't see any intuitive way to invert this interpretation, because $r has not-voted to all other choices. Assigning "inverted" votes like suggested in your other post might work, as long as they sum up to 550 again°. But the question of the weighting factor and distribution will come up again. Cheers Rolf _{(addicted to the Perl Programming Language :) Wikisyntax for the Monastery} °) respectively N' is corrected. update ²) IOW no matter if voters John and Jim swapped their place, their probability to be chosen stays the same.	[reply] [d/l]
Re^5: Randomly choosing from a set of alternatives with varying popularity by ibm1620 (Chaplain) on Mar 28, 2022 at 13:45 UTC
Re^6: Randomly choosing from a set of alternatives with varying popularity by LanX (Saint) on Mar 28, 2022 at 16:11 UTC

update