Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

This question isn't strictly about Perl, but the project I'm working on is in Perl, so I figure it's appropriate.
I've been given a function that returns random numbers distributed normally around a given mean in the range (mean - delta, mean + delta). I'm supposed to write a second function that will take the values returned by that function and skew the distribution towards some value other than the mean. How do I do this? Unfortunately, I can't just write another RNG that would produce values with the skewed distribution. I've found a couple of ways to do that on the 'net, but the people who put ink on my paychecks say I don't get to. Many thanks!

~Q


Replies are listed 'Best First'.
Re: Skew normally distributed random numbers
by bart (Canon) on Jan 27, 2005 at 21:40 UTC
    What you have to use, in general, is the inverse function of the cumulative distribution of the probability you want to achieve. Start reading right under formula 9.

    In less mathematical terms (but only slightly): let P(x) be the probability density you want, so that the chance of x falling between xi and xi+dx is approximately P(xi)*dx for very small dx. Let D(x) be the cumulative distribution, so that yi = D(xi) is the chance that x < xi. Formula 10 connects the two functions: P(x)*dx ≈ D(x+dx)-D(x) for all x. The value of D runs from 0 (for xi = -inf) to 1 (for xi = +inf).

    Note that Perl's rand function returns an approximately uniform distribution between 0 and 1. Let's say you pick one of these random numbers, use it as the current value for yi, and via the inverse function of D(x) get back a value for xi. Well: the values of x that you get back this way are distributed with probability function P(x). Et voilà.

    So all you need is a sub that implements, at least approximately, the inverse function of D(x).
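    As a minimal sketch of that recipe, here is a distribution whose inverse D(x) has a closed form: a triangular distribution on a fixed range with its peak moved off-centre. The triangular shape and the names inverse_triangular_cdf and triangular_rnd are assumptions chosen for illustration; they are not from the formulas referenced above, and a skewed bell curve would need a different (probably numerical) inverse.

```perl
use strict;
use warnings;

# Inverse of the cumulative distribution D(x) for a triangular
# distribution on [$lo, $hi] with its peak at $mode.
# $u is a probability in [0, 1); the result is the x with D(x) == $u.
sub inverse_triangular_cdf {
    my ( $u, $lo, $hi, $mode ) = @_;
    my $split = ( $mode - $lo ) / ( $hi - $lo );    # D($mode)
    return $u < $split
        ? $lo + sqrt( $u * ( $hi - $lo ) * ( $mode - $lo ) )
        : $hi - sqrt( ( 1 - $u ) * ( $hi - $lo ) * ( $hi - $mode ) );
}

# Draw one sample: feed a uniform random number through the inverse.
sub triangular_rnd {
    my ( $lo, $hi, $mode ) = @_;
    return inverse_triangular_cdf( rand(), $lo, $hi, $mode );
}

my @sample = map { triangular_rnd( 0, 100, 75 ) } 1 .. 5;
print "@sample\n";
```

    Every value stays inside 0 .. 100 and the values pile up around 75. Note that it is the peak, not the average, that sits at 75: the long-run mean of a triangular distribution is ( lo + hi + mode ) / 3.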

Re: Skew normally distributed random numbers (UPDATED!)
by BrowserUk (Patriarch) on Jan 27, 2005 at 22:04 UTC

    Update2: I don't think this is exactly how I did it before, but it does appear to work.

    #! perl -slw
    use strict;
    use List::Util qw[ sum min max ];

    # The -s switch lets -N=... and -MEAN=... on the command line set these.
    our $N    ||= 10_000;
    our $MEAN ||= 75;

    # Return a random number in $start .. $end whose long-run average is $skewedMean.
    sub skewedRnd {
        my( $start, $end, $skewedMean ) = @_;
        my $low  = $skewedMean - $start;
        my $high = $end - $skewedMean;
        # Pick the lower sub-range with probability $high / ( $end - $start ).
        # Measuring from $start keeps this correct when $start is not 0.
        return ( rand() < 1 - ( $low / ( $end - $start ) ) )
            ? $start + rand( $low )
            : $skewedMean + rand( $high );
    }

    my @values = map skewedRnd( 0, 100, $MEAN ), 1 .. $N;

    printf "range 0 .. 100; Min: %f Max:%f Mean:%f \n",
        min( @values ), max( @values ), sum( @values ) / $N;

    __END__
    [23:52:06.41] P:\test>425761 -N=1000000 -MEAN=99
    range 0 .. 100; Min: 0.003021 Max:99.999969 Mean:98.999126

    [23:52:28.58] P:\test>425761 -N=1000 -MEAN=99
    range 0 .. 100; Min: 5.154236 Max:99.997009 Mean:98.727916

    [23:52:41.92] P:\test>425761 -N=1000 -MEAN=20
    range 0 .. 100; Min: 0.029907 Max:99.995117 Mean:19.924415

    [23:52:51.27] P:\test>425761 -N=1000 -MEAN=10
    range 0 .. 100; Min: 0.008850 Max:97.561035 Mean:10.563889

    [23:52:55.64] P:\test>425761 -N=1000 -MEAN=75
    range 0 .. 100; Min: 0.144196 Max:99.998474 Mean:74.406415

    [23:53:01.49] P:\test>425761 -N=1000 -MEAN=90
    range 0 .. 100; Min: 1.606750 Max:99.976501 Mean:90.610480

    [23:53:08.42] P:\test>425761 -N=1000 -MEAN=51
    range 0 .. 100; Min: 0.049805 Max:99.974579 Mean:50.391814

    [23:53:33.88] P:\test>425761 -N=1000 -MEAN=51
    range 0 .. 100; Min: 0.221008 Max:99.923737 Mean:50.118810

    [23:53:36.24] P:\test>425761 -N=10000 -MEAN=51
    range 0 .. 100; Min: 0.001556 Max:99.998505 Mean:51.276380

    [23:53:40.10] P:\test>425761 -N=100000 -MEAN=51
    range 0 .. 100; Min: 0.000000 Max:99.997009 Mean:51.018693

    [23:53:44.44] P:\test>425761 -N=1000000 -MEAN=51
    range 0 .. 100; Min: 0.000000 Max:99.998505 Mean:50.992506

    UPDATE: Below here is wrong--

    I'll let the math guys pick it apart, and offer it on the basis that it is simple, understandable and served my purpose. Maybe it will serve yours too.


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
Re: Skew normally distributed random numbers
by xorl (Deacon) on Jan 27, 2005 at 21:10 UTC
    Just use the same function again, only send it the new values. IMHO this sounds more like a homework problem than a real-life problem. Either that or the people who put ink on your paychecks need to get out of the way and let you do your job.
Re: Skew normally distributed random numbers
by Roy Johnson (Monsignor) on Jan 27, 2005 at 21:13 UTC
    Average the generated number with the new value. That'll skew it in the right direction.

    This really isn't a Perl problem, and it's not a really well-defined statistics/math problem, either.


    Caution: Contents may have been coded under pressure.
Re: Skew normally distributed random numbers
by TedPride (Priest) on Jan 27, 2005 at 21:26 UTC
    Skew how? Do you just want the new value averaged with each point, or is it more like a gravity simulation, where closer dots are pulled more?
Re: Skew normally distributed random numbers
by fglock (Vicar) on Jan 28, 2005 at 11:39 UTC

    If I understand the question, you want to change the mean value of a random function to a given value, right? This is a linear transformation, and it is very simple to implement:

    $val = old_random_function() - $old_avg + $new_avg;

    or, create a new function:

    sub new_random_function { old_random_function() - $old_avg + $new_avg }

      The idea is to produce random numbers within a specified range, but with a skewed average. What you are describing will transform not only the average, but also the range.

      E.g., random numbers in the range 0 .. 100 will, over time, tend towards an average of 50. If the desired average were, say, 75, then your math will produce a set of values in the range:

      ( 0 - 50 + 75 ) .. ( 100 - 50 + 75 ) := 25 .. 125

      Which will tend towards the desired average, but is the wrong range. (I did a similar thing with my first attempt above).

      The transformation required is not linear, but statistical.
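      One way to post-process the existing values without moving the end points is a piecewise-linear remap: squeeze one side of the old centre and stretch the other. This is only a sketch of one possible reading of "skew"; the name skew_value and its argument order are made up for illustration, and it assumes $lo < $mean < $hi.

```perl
use strict;
use warnings;

# Remap $x from [$lo, $hi] so that the old centre $mean lands on $target
# while $lo and $hi stay where they are. Each side of the centre is
# stretched or squeezed linearly.
sub skew_value {
    my ( $x, $lo, $hi, $mean, $target ) = @_;
    return $x <= $mean
        ? $lo     + ( $x - $lo )   * ( $target - $lo ) / ( $mean - $lo )
        : $target + ( $x - $mean ) * ( $hi - $target ) / ( $hi - $mean );
}

# The end points are fixed and the old centre moves to the target.
printf "%s %s %s\n",
    skew_value( 0,   0, 100, 50, 75 ),
    skew_value( 50,  0, 100, 50, 75 ),
    skew_value( 100, 0, 100, 50, 75 );
```

      For a symmetric input this puts the median and the peak on the target. The average moves towards the target but generally stops short of it, and the density has a step at the target, so whether this counts as the right kind of skew depends on what is actually wanted.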


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.

        Actually, the AM says they want random numbers distributed normally, and that the range is specified as (mean - delta, mean + delta).

        I believe they want to change the mean, and keep the delta.
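        Under that reading the range moves with the mean, and the plain shift is enough. A rough sketch follows; given_rnd is a hypothetical stand-in for the function the AM was handed (the thread never shows it), and the spread of delta / 3 is an assumption.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the supplied function: normally distributed
# values around $mean, redrawn until they fall strictly inside
# ( $mean - $delta, $mean + $delta ). Box-Muller supplies the deviate.
sub given_rnd {
    my ( $mean, $delta ) = @_;
    my $sigma  = $delta / 3;           # assumed spread, not from the thread
    my $two_pi = 8 * atan2( 1, 1 );
    while (1) {
        # 1 - rand() lies in (0, 1], so log() never sees zero.
        my $z = sqrt( -2 * log( 1 - rand() ) ) * cos( $two_pi * rand() );
        my $x = $mean + $sigma * $z;
        return $x if abs( $x - $mean ) < $delta;
    }
}

# Shift each value by the difference of the means. The spread is untouched,
# so the range becomes ( $new_mean - $delta, $new_mean + $delta ).
sub shifted_rnd {
    my ( $old_mean, $new_mean, $delta ) = @_;
    return given_rnd( $old_mean, $delta ) - $old_mean + $new_mean;
}

print shifted_rnd( 50, 75, 10 ), "\n";    # somewhere between 65 and 85
```

        If the range has to stay where it was, this no longer applies and the objection above stands.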