narse has asked for the wisdom of the Perl Monks concerning the following question:

I need a method for randomly generating histograms. That is to say, I need to fill an array with number values, 0-100, and have the sum of all array elements = 100. I am wondering if there are any known methods for this sort of work or if someone can help me come up with a creative solution. Some error is alright, so if the total was 101 or 99 due to rounding error of sorts, that would not be a problem. This question is a slight simplification of what I am actually doing so I might not be able to use any modules designed for this, but I may be able to grok the code and alter it for my project if anyone has suggestions. Thanks in advance.

Replies are listed 'Best First'.
•Re: Randomly generating histograms
by merlyn (Sage) on Mar 25, 2002 at 15:47 UTC
    That's easy. Generate as many as you want, all from 0 to 1. Then take their sum, and go back and multiply each element by 100 divided by that sum!
    my @data;
    push @data, rand 1 for 1..30;  # for 30 numbers
    my $sum;
    $sum += $_ for @data;          # calculate sum
    $sum = 100 / $sum;             # ratio to ensure they sum to 100
    $_ *= $sum for @data;          # then scale them all

    -- Randal L. Schwartz, Perl hacker
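
The scaling approach above yields fractional values. If whole-number bins are wanted, one possible follow-up is to truncate each scaled value and hand the leftover units to the bins with the largest fractional parts. This is only a sketch: the name `scaled_histogram` and its parameters are made up for illustration, and the rounding rule (largest remainder) is an assumption, not something from the reply.

```perl
use strict;
use warnings;

# Hypothetical helper: $n random bins, scaled and rounded to sum to $total.
sub scaled_histogram {
    my ($n, $total) = @_;

    # Random weights in (0, 1]; using 1 - rand avoids an all-zero sum.
    my @weights = map { 1 - rand } 1 .. $n;
    my $sum = 0;
    $sum += $_ for @weights;

    # Scale so the values sum to $total, then keep the integer parts.
    my @scaled = map { $_ * $total / $sum } @weights;
    my @bins   = map { int $_ } @scaled;

    # Give the units lost to truncation to the largest fractional parts.
    my $assigned = 0;
    $assigned += $_ for @bins;
    my @order = sort { ($scaled[$b] - $bins[$b]) <=> ($scaled[$a] - $bins[$a]) }
                0 .. $#bins;
    $bins[$_]++ for @order[0 .. $total - $assigned - 1];

    return @bins;
}

my @hist = scaled_histogram(30, 100);
print "@hist\n";
```

Calling `scaled_histogram(30, 100)` returns 30 non-negative integers that add up to exactly 100.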

Re: Randomly generating histograms
by derby (Abbot) on Mar 25, 2002 at 15:37 UTC
    narse,

    Well, there may be other things that bound you (this example just uses integers and can have dupes) but a simplistic brute force approach would be:

    #!/usr/bin/perl -w
    use strict;

    my $tot = 0;
    my @array;

    # Keep drawing until the running total reaches 100,
    # capping the last draw so the total never overshoots.
    while ( $tot < 100 ) {
        my $x = int rand( 100 );
        $x = 100 - $tot if $tot + $x > 100;
        push @array, $x;
        $tot += $x;
    }

    $tot = 0;
    foreach ( @array ) {
        print $_, "\n";
        $tot += $_;
    }
    print "Total is ", $tot, "\n";

    I'm sure there are prettier approaches, but start with rand and perlfaq4 for ideas.

    -derby

      This is similar to the method I am using at the moment for testing. The problem I have is that the first entries in the array always have the largest numbers, while the last ones are nearly always 0. I mixed this up some by randomizing which array index each value goes into, but I'm hoping to find something that's a little more random. Thanks, though.
        This is completely hackish but loop for x number of times (12, 100?) before using the values of rand:

        $x = int rand( 100 ) for ( 1 .. 12 );
        $x = 0;

        update: Or better yet. After you have the array filled, shuffle it as described in perlfaq4. -derby
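
The fill-then-shuffle idea might look like the sketch below. It assumes `List::Util` is available (it ships with modern Perl) and uses its `shuffle` rather than the hand-written loop from perlfaq4. The draw is changed to 1..100 so every pass adds something, which is a small departure from the brute-force code earlier in this thread.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Fill bins greedily, capping the last draw so the total is exactly 100.
my $total = 0;
my @bins;
while ($total < 100) {
    my $x = 1 + int rand 100;                 # 1..100, so each pass makes progress
    $x = 100 - $total if $total + $x > 100;   # cap to avoid overshooting
    push @bins, $x;
    $total += $x;
}

# Shuffle so the large early values do not always sit at the front.
@bins = shuffle @bins;
print "@bins\n";
```

Shuffling only changes where the large values land. The sizes themselves are still skewed toward a few big entries, so this does not give every bin the same expected share.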

      I probably didn't express what I was hoping to do well enough, so not many people hit on what I was asking. Thanks for the responses nonetheless. I found a method elsewhere that does what I want very well and is easier to implement than the others that I saw. Here it is:
      #!/usr/bin/perl -w
      use strict;

      my @arr;
      map { $arr[rand 10]++ } 0..99;
      print join(' ', @arr), "\n";
Re: Randomly generating histograms
by particle (Vicar) on Mar 25, 2002 at 15:53 UTC
    think about the probability distribution you want. some examples are:

    * continuously and uniformly distributed (equal distribution over the range)
    * continuously and normally distributed (like a bell curve)
    * discretely and nonuniformly distributed (like loaded dice)

    once you've figured out what you want your data to look like, you can create it using the appropriate function, and tally the results as you go, stopping when you approach your max (within some acceptable error value.)
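
As one illustration of "pick a distribution, then tally as you go", here is a rough sketch for the bell-curve case. It draws normal deviates with the Box-Muller transform and counts them into bins until the total is reached. The helper names `gaussian` and `bell_histogram`, the centring on the middle bin, and the spread of one sixth of the bin count are all assumptions chosen for the example.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Box-Muller transform: one standard normal deviate from two uniform ones.
sub gaussian {
    my $u1 = 1 - rand;                  # in (0, 1], keeps log() finite
    my $u2 = rand;
    my $pi = 4 * atan2(1, 1);
    return sqrt(-2 * log($u1)) * cos(2 * $pi * $u2);
}

# Hypothetical helper: tally $total draws into $n bins, bell-shaped
# around the middle of the range.
sub bell_histogram {
    my ($n, $total) = @_;
    my @bins   = (0) x $n;
    my $placed = 0;
    while ($placed < $total) {
        my $i = floor($n / 2 + gaussian() * $n / 6);
        next if $i < 0 || $i >= $n;     # discard draws outside the range
        $bins[$i]++;
        $placed++;
    }
    return @bins;
}

print join(' ', bell_histogram(10, 100)), "\n";
```

Swapping `gaussian()` for a plain `rand` based index would give the uniform case, and a weighted lookup table would give the loaded-dice case.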

    give it a try, and post your code if you run into trouble.

    ~Particle ;̃

      Sorry, I should have specified what shape distribution. Ideally each entry would have an equal randomness, so a uniform distribution would be the goal. Unfortunately I'm not much for mathematical logic.
Re: Randomly generating histograms
by Jasper (Chaplain) on Mar 25, 2002 at 18:57 UTC
    Randal's already answered this, so you probably aren't reading further, but here's my go
    my $sum = 100;
    my $elements = 30;
    my @array;
    for (1..$sum) {
        ++$array[ int rand $elements ];
    }
    $array[$_] ||= 0 for @array;
    so you can fiddle with the sum, and the number of elements you want at your leisure.

    Jasper
      i'd use map for a one liner to replace for:
      map ++$array[rand $elements], 1..$sum;
      also, you can drop the int, like i have. and you mean
      $array[$_] ||= 0 for 0..$elements-1;
      don't you?

      ~Particle ;̃

        i'd use map for a one liner to replace for:
        map ++$array[rand $elements], 1..$sum;
        I wouldn't. This is clearer, faster, and one character shorter:
        ++$array[rand $elements] for 1..$sum;
        Help stamp out void maps and greps!

        -- Randal L. Schwartz, Perl hacker
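
Pulling the suggestions from this sub-thread together, a complete version might look like the sketch below. Pre-filling the array with zeros is an addition that is not in any of the replies; it replaces the separate `||= 0` pass.

```perl
use strict;
use warnings;

my $sum      = 100;
my $elements = 30;

# Start every bin at zero so bins that are never hit are still defined.
my @array = (0) x $elements;

# rand $elements is truncated to an integer when used as an array index.
++$array[rand $elements] for 1 .. $sum;

print "@array\n";
```

Each of the 100 units lands in a bin chosen uniformly at random, so every bin has the same expected count and the total is always exactly `$sum`.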

Re: Randomly generating histograms
by talexb (Chancellor) on Mar 25, 2002 at 15:18 UTC
    I seem to be repeating myself this morning. What have you tried? What didn't work?

    --t. alex

    "Here's the chocolates, and here's the flowers. Now how 'bout it, widder hen, will ya marry me?" --Foghorn Leghorn