The original "gen" was substantially faster than both.

I mentioned that gen3 would be slower, but (I thought) fairer. This turns out not to be the case :(

The basic algorithm is to allocated the minimum possible amount to each of the values. Subtract that from 100 to get ($remainder). Then allocate random amounts of that remainder to a random recipient until it's all gone. Skipping and trying again if the random allocated amount is greater than the randomly chosen recipient can accept.

Whilst not actually determanistic--there is a random element after all :)--it converges to 0 reliably and very quickly always producing a correct solution. And all solutions are possible.

The problem with the algorithm is that it is not fair. It favours some solutions over others.

This table (split into 3), represents the results generated from a run producing 1e6 solutions to the OP problem.

The columns represent the last variable (40-60 range). The rows are the first variable (5 - 35 range). Within the table are two values: the corresponding value for the third variable (5-55 range) on the left and the frequency that the 3-value combination was chosen.

|40 |41 |42 |43 |44 |45 |46 --+-------+-------+-------+-------+-------+-------+------- 35|25 1388|24 1106|23 1153|22 1259|21 1441|20 1632|19 1726 34|26 1067|25 734|24 811|23 903|22 1021|21 1214|20 1384 33|27 978|26 598|25 690|24 785|23 901|22 1057|21 1201 32|28 964|27 618|26 663|25 753|24 913|23 1062|22 1176 31|29 891|28 588|27 609|26 676|25 794|24 1011|23 1123 30|30 865|29 531|28 579|27 667|26 798|25 934|24 1180 29|31 847|30 497|29 518|28 665|27 743|26 967|25 1054 28|32 790|31 506|30 511|29 614|28 745|27 866|26 1021 27|33 763|32 475|31 568|30 615|29 743|28 850|27 1031 26|34 691|33 470|32 514|31 611|30 724|29 847|28 971 25|35 735|34 452|33 468|32 585|31 683|30 874|29 962 24|36 709|35 416|34 485|33 590|32 686|31 836|30 999 23|37 616|36 403|35 430|34 526|33 667|32 778|31 953 22|38 581|37 340|36 379|35 477|34 588|33 735|32 931 21|39 500|38 335|37 400|36 447|35 572|34 672|33 867 20|40 467|39 306|38 340|37 457|36 536|35 640|34 765 19|41 412|40 268|39 280|38 354|37 443|36 582|35 663 18|42 392|41 223|40 258|39 324|38 403|37 476|36 636 17|43 335|42 231|41 216|40 304|39 340|38 485|37 558 16|44 279|43 173|42 213|41 297|40 329|39 413|38 499 15|45 315|44 185|43 209|42 258|41 314|40 437|39 521 14|46 222|45 125|44 175|43 210|42 254|41 312|40 407 13|47 186|46 104|45 125|44 179|43 208|42 298|41 320 12|48 130|47 76|46 92|45 129|44 153|43 206|42 257 11|49 99|48 63|47 80|46 98|45 120|44 139|43 186 10|50 64|49 57|48 59|47 64|46 98|45 108|44 125 9|51 50|50 36|49 47|48 55|47 77|46 88|45 129 8|52 55|51 27|50 38|49 67|48 68|47 79|46 106 7|53 47|52 25|51 25|50 38|49 44|48 70|47 81 6|54 39|53 24|52 23|51 38|50 31|49 69|48 80 5|55 52|54 38|53 36|52 40|51 65|50 77|49 130 |47 |48 |49 |50 | 51 |52 |53 --+-------+-------+-------+-------+-------+-------+------- 35|18 1986|17 2270|16 2448|15 2983|14 2530|13 2305|12 2140 34|19 1478|18 1668|17 1842|16 2025|15 2056|14 1908|13 1870 33|20 1366|19 1512|18 1682|17 1832|16 1820|15 1984|14 1915 32|21 1316|20 1531|19 1730|18 1821|17 1721|16 1681|15 2021 31|22 1376|21 1490|20 1658|19 1818|18 1700|17 1800|16 1870 30|23 1312|22 1535|21 1749|20 1824|19 1767|18 1808|17 1887 29|24 1229|23 1548|22 1657|21 1914|20 1809|19 1845|18 2009 28|25 1209|24 1431|23 1730|22 1917|21 1815|20 1937|19 1954 27|26 1207|25 1448|24 1750|23 2051|22 1807|21 1970|20 2104 26|27 1175|26 1409|25 1707|24 2047|23 1864|22 2035|21 2148 25|28 1242|27 1385|26 1702|25 2048|24 1944|23 2179|22 2237 24|29 1200|28 1343|27 1665|26 1953|25 1935|24 2100|23 2379 23|30 1163|29 1407|28 1676|27 1890|26 1836|25 2116|24 2343 22|31 1126|30 1270|29 1565|28 1851|27 1815|26 2019|25 2339 21|32 1007|31 1292|30 1507|29 1810|28 1679|27 2006|26 2100 20|33 927|32 1157|31 1425|30 1710|29 1622|28 1877|27 1989 19|34 885|33 973|32 1259|31 1550|30 1524|29 1752|28 2002 18|35 741|34 952|33 1163|32 1426|31 1461|30 1610|29 1786 17|36 722|35 878|34 1052|33 1280|32 1387|31 1450|30 1596 16|37 627|36 783|35 980|34 1172|33 1202|32 1312|31 1542 15|38 598|37 787|36 959|35 1203|34 1226|33 1401|32 1643 14|39 509|38 632|37 834|36 999|35 1000|34 1222|33 1337 13|40 442|39 560|38 648|37 784|36 787|35 921|34 1163 12|41 320|40 399|39 501|38 665|37 597|36 728|35 928 11|42 243|41 365|40 409|39 540|38 540|37 601|36 765 10|43 176|42 254|41 320|40 403|39 409|38 458|37 587 9|44 179|43 192|42 270|41 316|40 312|39 408|38 449 8|45 122|44 160|43 209|42 267|41 239|40 321|39 341 7|46 103|45 115|44 160|43 223|42 198|41 268|40 309 6|47 99|46 108|45 125|44 183|43 181|42 229|41 291 5|48 121|47 168|46 254|45 279|44 311|43 355|42 419 |54 |55 |56 |57 |58 |59 |60 --+-------+-------+-------+-------+-------+-------+------- 35|11 2090|10 2029| 9 2069| 8 1841| 7 1980| 6 2147| 5 3417 34|12 1789|11 1705|10 1584| 9 1560| 8 1646| 7 1790| 6 2455 33|13 1855|12 1712|11 1671|10 1737| 9 1742| 8 1841| 7 2437 32|14 1988|13 1969|12 1874|11 1964|10 1924| 9 2123| 8 2773 31|15 2259|14 2159|13 2160|12 2117|11 2231|10 2354| 9 3236 30|16 2051|15 2514|14 2414|13 2437|12 2578|11 2794|10 3621 29|17 2158|16 2417|15 2834|14 2782|13 2793|12 3173|11 4410 28|18 2182|17 2378|16 2613|15 3274|14 3446|13 3719|12 5159 27|19 2236|18 2357|17 2662|16 3145|15 3728|14 4315|13 5951 26|20 2309|19 2608|18 2787|17 3257|16 3715|15 4947|14 6915 25|21 2520|20 2721|19 3060|18 3294|17 3918|16 4862|15 8045 24|22 2579|21 2896|20 3257|19 3517|18 4172|17 5088|16 8058 23|23 2749|22 2956|21 3296|20 3840|19 4333|18 5447|17 8236 22|24 2623|23 2950|22 3389|21 3942|20 4579|19 5580|18 8478 21|25 2460|24 2903|23 3312|22 3697|21 4437|20 5712|19 8578 20|26 2301|25 2785|24 3189|23 3637|22 4433|21 5524|20 8819 19|27 2210|26 2541|25 3122|24 3636|23 4537|22 5554|21 8553 18|28 2027|27 2375|26 2730|25 3306|24 4288|23 5477|22 8472 17|29 1963|28 2210|27 2726|26 3177|25 3975|24 5142|23 8239 16|30 1845|29 2213|28 2564|27 2969|26 3685|25 4958|24 7985 15|31 1904|30 2142|29 2639|28 3139|27 3925|26 5044|25 7934 14|32 1579|31 1757|30 2209|29 2650|28 3260|27 4230|26 6789 13|33 1250|32 1564|31 1833|30 2170|29 2773|28 3596|27 5750 12|34 1106|33 1243|32 1555|31 1853|30 2454|29 3094|28 4962 11|35 861|34 1026|33 1208|32 1543|31 1876|30 2565|29 4114 10|36 694|35 849|34 1030|33 1266|32 1631|31 2101|30 3474 9|37 561|36 625|35 804|34 968|33 1289|32 1802|31 2889 8|38 455|37 516|36 689|35 792|34 1032|33 1400|32 2416 7|39 310|38 448|37 554|36 629|35 845|34 1190|33 2073 6|40 328|39 402|38 445|37 596|36 794|35 1003|34 1755 5|41 475|40 599|39 703|38 918|37 1180|36 1447|35 2061 651 selections were made

As you can see, all 651 possible solutions were chosen, but their frequencies vary from as low as 23 to a high of 8819. Ideally this would be plotted as a 3D scatter plot to allow it to be visuallised correctly, but I don't have anything that will allow me to do that easily. And I couldn't post the results if I did :(

The reason the results are skewed is that when the initial random allocation of the remainder are made, the range of random values that can be chosen is larger than the residual range of any of the values, so the largest residual will be favoured.

The change for gen3() is to constrain the size of the random allocations from the remainder at each iteration to a size smaller than the smallest residual after their minimum allocations. This means that algorithm interates more times, but at each stage, it is more likely that whatever value is randomly chosen will be able to accept the allocation--until you reach the point where that value has reached it's upper bound.

Unfortunately, a bug in my statistics gathering routine meant that I saw what I was hoping to see. When I correct that bug, it turns out that this slower gen3() algorithm does little better than gen().

In summary, gen3() produces nearly identical (unfair) results to gen(), so if the fairness is not a problem, gen() is the better choice.

If fairness is important, my other post above may be worth considering. By producing all solutions and the picking one at random, you ensure fairness, but the price is that it gets very slow as the number of values and their ranges increase. I keep thinking, and have been pursuing a solution that generates all the possible solutions without iterating all the non-possibles and discarding them.

I am really quite sure that this is possible for a 3-value problem. And I think it should be extensible to N-values, but I've always had trouble visualising things in greater than 3 dimensions and if I cannot visualise it, I have great trouble coding it.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^3: Need technique for generating constrained random data sets by BrowserUk
in thread Need technique for generating constrained random data sets by GrandFather

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.