
As others have implied, I don't think you're going to be able to do this deterministically, as you have over-specified your solution set. The sum of your maximums is greater than 100 and the sum of your minimums is less than 100.
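To make the bounds concrete, here is a quick check of those two sums for the constraints in the question, reading each one as mid +- sd. The variable names are just for illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Constraints from the question: each value must lie within mid +- sd.
my @constraints = (
    { mid => 20, sd => 15 },
    { mid => 30, sd => 25 },
    { mid => 50, sd => 10 },
);

my $sum_of_minima = sum( map { $_->{mid} - $_->{sd} } @constraints );
my $sum_of_maxima = sum( map { $_->{mid} + $_->{sd} } @constraints );

print "sum of minima: $sum_of_minima\n";    # 5 + 5 + 40  = 50
print "sum of maxima: $sum_of_maxima\n";    # 35 + 55 + 60 = 150
```

Since 100 lies strictly between 50 and 150, there is a whole range of valid sets rather than a single one.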

What comes to mind to minimize the number of discarded sets and avoid scaling is to pick the numbers in descending order of midpoint. After generating each number, check whether the remainder is at least the sum of the remaining minimums and at most the sum of the remaining maximums. If it's outside those bounds, repick the last number. When you're down to the last number, you don't generate it randomly; it's just the remainder.

Basically, you're pruning off choices that can't satisfy the remaining constraints.
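As a small worked example of that pruning test, here is a sketch with a hypothetical helper, remainder_is_feasible (my name, not something from the thread), applied to two picks against the constraints from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $remainder can still be split among the
# constraints that have not been given a value yet.
sub remainder_is_feasible {
    my ($remainder, @remaining) = @_;
    my ($min, $max) = (0, 0);
    for my $c (@remaining) {
        $min += $c->{mid} - $c->{sd};
        $max += $c->{mid} + $c->{sd};
    }
    return $remainder >= $min && $remainder <= $max;
}

# Suppose 58 was picked for 50 +- 10, leaving 42 for the other two,
# whose combined range is 10 .. 90.
my @remaining = ( { mid => 30, sd => 25 }, { mid => 20, sd => 15 } );
printf "%s\n", remainder_is_feasible( 42, @remaining ) ? "keep" : "repick";

# Now suppose 40 is picked for 30 +- 25. Only 2 is left for 20 +- 15,
# whose range is 5 .. 35, so that pick has to be redone.
printf "%s\n",
    remainder_is_feasible( 42 - 40, { mid => 20, sd => 15 } ) ? "keep" : "repick";
```

The first call prints "keep" and the second prints "repick".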

Without thinking it through further, I'm worried that doing it in descending order of midpoint might bias the results. I think you could pick them in random order and you're just more likely to have to repick numbers along the way.
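A minimal sketch of the random-order idea, assuming a flat distribution over mid +- sd and a target total of 100. The name generate_set_shuffled is made up for this example; shuffle comes from the core List::Util module.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical variant: same pruning idea, but the constraints are
# visited in random order instead of descending order of midpoint.
sub generate_set_shuffled {
    my ($remainder, @constraints) = @_;
    my ($min_left, $max_left) = (0, 0);
    for my $c (@constraints) {
        $min_left += $c->{mid} - $c->{sd};
        $max_left += $c->{mid} + $c->{sd};
    }

    my @order = shuffle @constraints;
    my $last  = pop @order;    # the last one simply gets the remainder
    my @results;
    for my $c (@order) {
        my ($lo, $hi) = ( $c->{mid} - $c->{sd}, $c->{mid} + $c->{sd} );
        my $n = $lo + rand( $hi - $lo );    # flat pick within lo .. hi
        my $new_remainder = $remainder - $n;

        # repick if the remaining constraints could not absorb the rest
        redo if $new_remainder < $min_left - $lo
             || $new_remainder > $max_left - $hi;

        push @results, [ $c->{mid}, $c->{sd}, $n ];
        $remainder = $new_remainder;
        $min_left -= $lo;
        $max_left -= $hi;
    }
    return @results, [ $last->{mid}, $last->{sd}, $remainder ];
}

my @constraints = (
    { mid => 20, sd => 15 },
    { mid => 30, sd => 25 },
    { mid => 50, sd => 10 },
);
for my $r ( generate_set_shuffled( 100, @constraints ) ) {
    printf "%5.1f +-%5.1f: %5.1f\n", @$r;
}
```

Results come back in the shuffled order, so each row carries its own mid and sd to say which constraint it belongs to.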

Update: Here's a code sample:

    use strict;
    use warnings;

    sub RandFlat {
        # Return a rand () value with a flat distribution about the $mean +- $stdDev
        my ($mean, $stdDev) = @_;
        my $range = 2.0 * $stdDev;
        my $value = rand ($range);
        return $value + ($mean - $stdDev);
    }

    sub generate_set {
        my ($remainder, @constraints) = @_;
        my @results;
        my ($sum_of_minima, $sum_of_maxima);
        for my $c ( @constraints ) {
            $sum_of_minima += $c->{mid} - $c->{sd};
            $sum_of_maxima += $c->{mid} + $c->{sd};
        }

        # iterate through N-1 constraints in descending order
        my @descending = sort { $b->{mid} <=> $a->{mid} } @constraints;
        my $last_value = pop @descending;
        for my $c ( @descending ) {
            my $n = RandFlat( $c->{mid}, $c->{sd} );

            # repeat if remainder outside sum of the remaining
            # minima and maxima constraints
            my $new_remainder = $remainder - $n;
            my $new_sum_of_minima = $sum_of_minima - ( $c->{mid} - $c->{sd} );
            my $new_sum_of_maxima = $sum_of_maxima - ( $c->{mid} + $c->{sd} );
            redo if ( $new_remainder < $new_sum_of_minima)
                 || ( $new_remainder > $new_sum_of_maxima);

            # otherwise save number and update the remainder and constraints
            push @results, [ $c->{mid}, $c->{sd}, $n ];
            $remainder = $new_remainder;
            $sum_of_minima = $new_sum_of_minima;
            $sum_of_maxima = $new_sum_of_maxima;
        }

        # the remainder must now satisfy the final constraint
        return @results, [ $last_value->{mid}, $last_value->{sd}, $remainder ];
    }

    my $total = 100;
    my @constraints = (
        {mid => 20, sd => 15},
        {mid => 30, sd => 25},
        {mid => 50, sd => 10},
    );

    for ( 1 .. 5 ) {
        for my $result ( generate_set($total, @constraints) ) {
            my ($mid, $sd, $value) = @$result;
            printf "%5.1f +-%5.1f: %5.1f", $mid, $sd, $value;
            print " - bad" if ($value < ($mid - $sd)) || ($value > ($mid + $sd));
            print "\n";
        }
        print "\n";
    }

I ran it 1000 times and didn't see any bad results.

Also, thinking it through again, I don't think the descending order will wind up biased (but I could be convinced by a good argument).
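One rough way to probe the bias question empirically is to compare the average value each constraint receives under different pick orders. The sketch below is only that: pick_in_order and mean_by_mid are hypothetical helpers, the run count is arbitrary, and matching means would not by themselves rule out differences in the shape of the distributions.

```perl
use strict;
use warnings;

# Hypothetical helper: fill the constraints in the order given, repicking
# any value that leaves an infeasible remainder. The last constraint
# receives whatever remains.
sub pick_in_order {
    my ($remainder, @order) = @_;
    my ($min_left, $max_left) = (0, 0);
    for my $c (@order) {
        $min_left += $c->{mid} - $c->{sd};
        $max_left += $c->{mid} + $c->{sd};
    }
    my %value;    # keyed by midpoint, which is unique in this example
    for my $i ( 0 .. $#order ) {
        my $c = $order[$i];
        my ($lo, $hi) = ( $c->{mid} - $c->{sd}, $c->{mid} + $c->{sd} );
        $min_left -= $lo;    # bounds now cover only the later constraints
        $max_left -= $hi;
        if ( $i == $#order ) { $value{ $c->{mid} } = $remainder; last }
        my $n;
        do { $n = $lo + rand( $hi - $lo ) }
            until $remainder - $n >= $min_left && $remainder - $n <= $max_left;
        $value{ $c->{mid} } = $n;
        $remainder -= $n;
    }
    return %value;
}

# Average value per constraint over $runs sets with a total of 100.
sub mean_by_mid {
    my ($runs, @order) = @_;
    my %sum;
    for ( 1 .. $runs ) {
        my %v = pick_in_order( 100, @order );
        $sum{$_} += $v{$_} for keys %v;
    }
    return map { $_ => $sum{$_} / $runs } keys %sum;
}

my @constraints = (
    { mid => 20, sd => 15 },
    { mid => 30, sd => 25 },
    { mid => 50, sd => 10 },
);
my @descending = sort { $b->{mid} <=> $a->{mid} } @constraints;
my @ascending  = reverse @descending;

for my $case ( [ descending => \@descending ], [ ascending => \@ascending ] ) {
    my ($label, $order) = @$case;
    my %mean = mean_by_mid( 10_000, @$order );
    printf "%-10s %s\n", $label,
        join ", ", map { sprintf "mid %d: %.2f", $_, $mean{$_} }
                   sort { $a <=> $b } keys %mean;
}
```

If the per-constraint averages shift noticeably between the two orders, that would be a hint that the order matters. A histogram of each value would be a more telling follow-up.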

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.


In reply to Re: Need technique for generating constrained random data sets by xdg
in thread Need technique for generating constrained random data sets by GrandFather
