in reply to Renovating Best Nodes
As things stand, most times this is run, we don't have a single node in the top 20. More than a quarter of the time we won't get anything in the top 50.
Of course getting a broad selection is good as well.
The following snippet shows how you can balance the two fairly flexibly:
```perl
# I'm assuming that sth returns a long list of nodes ordered
# from lowest rep to highest and then newest to oldest.
my @selected;
for (1..50) {
    push @selected, $sth->fetchrow_hashref();
}
while (my $row = $sth->fetchrow_hashref()) {
    if (rand(1) < 0.1) {
        $selected[rand(@selected)] = $row;
    }
}
```

The resulting distribution has the following properties (back of the envelope calculation):
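A rough way to put numbers on the distribution of the first snippet, as a sketch rather than anything measured. It assumes the list is long enough that the nodes of interest come after the first 50 rows, and it hard-codes the 0.1 and 50 from the snippet. The helper names (`chance_selected`, `expected_from_last`, `chance_none_of_last`) are mine and not part of the original code.

```perl
use strict;
use warnings;

# Parameters copied from the snippet: replacement chance and number of slots.
my $p     = 0.1;
my $slots = 50;

# Chance that the node $k places from the end of the list is in the final
# selection: it must be picked (chance $p) and then not be overwritten by any
# of the $k - 1 later rows (each overwrites its slot with chance $p / $slots).
sub chance_selected {
    my ($k) = @_;
    return $p * (1 - $p / $slots) ** ($k - 1);
}

# Expected number of the last $m nodes in the final selection
# (the sum of chance_selected over 1..$m, written in closed form).
sub expected_from_last {
    my ($m) = @_;
    return $slots * (1 - (1 - $p / $slots) ** $m);
}

# Chance that none of the last $m nodes is selected. The last row picked is
# never overwritten, so the selection misses them only if all $m are skipped.
sub chance_none_of_last {
    my ($m) = @_;
    return (1 - $p) ** $m;
}

for my $m (20, 50) {
    printf "last %d: expect %.2f selected, none selected %.1f%% of the time\n",
        $m, expected_from_last($m), 100 * chance_none_of_last($m);
}
```

Under those assumptions an average run has roughly two of the top 20 nodes in it, and a run with none of the top 20 turns up about 12% of the time. Raising the 0.1 pushes the selection toward the top; lowering it spreads the selection out.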
UPDATE: Here is a changed code sample that does the same as the above, only it reads from the highest reputation node to the lowest because I've been told that this is better. (A fact that complicates it, but oh well.)
```perl
# I'm assuming that sth returns a long list of nodes ordered
# from highest rep to lowest and then oldest to newest.
my @selected;
my @filler;
my $limit = 50;
while (my $row = $sth->fetchrow_hashref()) {
    if (rand(1) < 0.1) {
        $selected[rand($limit)] ||= $row;
    }
    elsif (@filler < $limit) {
        push @filler, $row;
    }
}
for (0..($#filler)) {
    $selected[$_] ||= $filler[$_];
}
```

This does the same thing as the snippet above except that I am filling in "nothing got chosen by chance" with top nodes rather than bottom nodes. If you fetch 4000 nodes, then you will only fill in from the filler 1.68% of the time. Alternately you can change the 0.1 to 0.2, and leave the number of nodes that you fetch at 2000. Or you can fetch 2000, leave the parameter at 0.1, and say that people don't mind seeing an extra one of the top 55 nodes or so get spotlighted 60% of the time.
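The fill-in percentages can be reproduced with a Poisson-style estimate. This is my reconstruction of the arithmetic, not necessarily how the figures were originally worked out, and the function names are hypothetical. With `$n` rows fetched and chance `$p` per row, each of the `$limit` slots is hit about `$n * $p / $limit` times, so it stays empty with probability about `exp(-$n * $p / $limit)`.

```perl
use strict;
use warnings;

# Expected number of slots that chance never fills
# (Poisson approximation, not an exact count).
sub expected_empty_slots {
    my ($n, $p, $limit) = @_;
    return $limit * exp(-$n * $p / $limit);
}

# Approximate chance that at least one slot has to come from @filler,
# treating the number of empty slots as Poisson distributed.
sub chance_of_filler {
    my ($n, $p, $limit) = @_;
    return 1 - exp(-expected_empty_slots($n, $p, $limit));
}

# The three combinations discussed: rows fetched and chance per row.
for my $case ([4000, 0.1], [2000, 0.2], [2000, 0.1]) {
    my ($n, $p) = @$case;
    printf "n=%d p=%.1f: %.4f empty slots expected, filler used %.1f%% of the time\n",
        $n, $p, expected_empty_slots($n, $p, 50), 100 * chance_of_filler($n, $p, 50);
}
```

Fetching 4000 rows at 0.1 and fetching 2000 rows at 0.2 give the same estimate, about 0.0168 empty slots per run, which is where a figure like 1.68% comes from. Fetching 2000 rows at 0.1 leaves a little under one empty slot on average, so filler shows up in roughly 60% of runs.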
Many other ways to tweak this exist.