If performance is an issue then you may consider caching the code which generates the numbers.

In this entry I have set up a $funcs scalar to hold some code refs depending on how the function was called ie. the weights.

The function flattens the elements sent in into a text string then uses that as a key value to save the code ref under. Then it calculates some values which normalize the input probabilities into an array which will have about 100 elements. Each element of the array stores the desired output value. We then build an anonymous function which returns a random output values based on the real number of items in the array. We precompute and store the number into a scalar before building the function so we dont have to get the size of the array every time we make up a random value.

This function would work for N different weightings used in the same program since it would make up a function to do each distinct weighting and save it off for later use.

This could be extended into a constructor such that you pass in the relative weights and the constructor returns the code ref directly to the caller so that it is no longer necessary to even pass the weighting parameters on each call. In the sample program given this caching is a huge win since except for the first time we do a code lookup and then reference into a fixed length array by a random subscript each time a number is desired. If the sub passed the code ref out directly then this could be at least twice as fast since the key construction to lookup the code probably takes as long as getting the random number from the function.

Caveats: If you have large number of outcomes then 100 will not be enough slots. Changing the numerator on the $scale= line will add more slots and use more space accordingly. Also I am taking the weightings by value from @_ where the original poster was passing a reference. Be sure to include the block which encloses the definition of funcs and weighted_rand together to have presistance of your generated functions.

Have Fun

{ my $funcs; sub weighted_rand { my @weight=@_; my $key=join(',', @weight); return &{$funcs->{$key}} if($funcs->{$key}); # use existing funct +ion my $sum; map { $sum += $_ } @weight; my $scale=100/$sum; my @outcome; my $outcome=0; foreach my $w ( @weight ) { my $cnt=sprintf "%.0f", $w*$scale; # get a rounded number of sl +ots push @outcome, (split '', $outcome x $cnt); $outcome++; } my $outcount=@outcome; # if the rounding gets us 99 or 101 this w +ill adjust the function $funcs->{$key} = sub { $outcome[rand($outcount)] }; return &{$funcs->{$key}}; } }

In reply to Re: Weighted random numbers generator by dga
in thread Weighted random numbers generator by spurperl

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.