in reply to A script with a loop is running my computer Out of memory

The history of mathematics and (more recently) computer science has been moved along with respect to innovation with almost a singularly common goal; finding ways to calculate things that require less work. Your calculations are simple, but the work you're doing is hard, and significant. It would be better to look again at the model, and see if there's a mathematical solution (ie, a better algorithm) to accomplish the same thing with less work.

Your current solution is going O(c^n), ie, exponential. I suspect that there is an O(1) solution to the question of population growth rates. What that means is your current solution requires an exponential amount of work per additional generation, while there probably exists a non-iterative solution instead.

I recall we talked in CB about this a little. I may have been the person who mentioned forking, but it was in jest. My point was "If you think this solution crashes your computer, you ought to see what it does if the children are forks." (Which is akin to fork-bombing). That was intended as a muse, not as a suggestion. The concept behind a fork bomb is that each child spawns a new process, which in turn spawns a few new processes, and so on until the operating system and hardware simply can't accommodate them all. But that situation is actually similar to what's happening here, just without forks; You're spawning so many values, and holding onto so many of them, that the system cannot accommodate them all. And to what goal? To solve a problem that can be solved with a mathematical equation that doesn't require repetition.

I also briefly touched on the concept of Big O notation. You mentioned that is an unfamiliar topic. So we might as well discuss it a little here. Think of "Big O notation" as a measure of resources consumed. The resource may be computational work (time), complexity, or memory, for starters. Your code is fairly intensive of both time and memory. But we'll focus on how much work is being done.

I'm going to briefly discuss Big-O, and in terms that are oversimplified. Common Big-O notation is written in the format of O(x), where 'x' is a formula that describes how much work is being done in terms of 'n' (n being number of items being worked on). O(n) means the amount of additional work per additional item is equal as the number of items grows.

Other common units include:

So if you look at your program, each new generation requires exponentially more work than the previous generation. This is a pretty 'bad situation' to be in from a programming standpoint. You have to start saying to yourself, can this be reduced to a statistical model that is characterized by a simple formula? If that's the case, your work drops to O(1).


Dave

  • Comment on Re: A script with a loop is running my computer Out of memory

Replies are listed 'Best First'.
Re^2: A script with a loop is running my computer Out of memory
by moritz (Cardinal) on Feb 02, 2011 at 21:33 UTC

    There is indeed an analytic approach.

    If a generation contains $n people, then what the script does is adding $n random variables. Since all 6 possible values are equally likely, the resulting sum is a Binomial distribution.

    For big $n (let's say $n >= 20) the Binomial distribution can be very well approximated by a normal distribution.

    Normally distributed random variables can be easily and efficiently implemented with the Box–Muller transform, which turns two uniformly distributed random numbers (what rand returns) into two normally distributed random numbers.

    If this were a project of mine (and I needed an efficient approach), I'd follow the approach of individual random numbers per population item as long as the population is less than 20, and for higher numbers I'd use the normal distribution approximation.

    If there's interest, I can also show how the mean and variance of the random variable is calculated, but I'm tired right now and I'd do it tomorrow :-)

    Update: mean and variance isn't as complicated as I thought.

    The mean is just (0 + 1 + 2 + 3 + 4 + 5) / 6 = 2.5, and the variance is 1/6 * ((0-2.5)**2 + (1 - 2.5)**2 + (2-2.5)**2 + (3-2.5)**2 + (4-2.5)**2 + (5-2.5)**2) = 35/12 = 2.91666666666667. With these parameters you can just use the formulas in the various Wikipedia entries.

Re^2: A script with a loop is running my computer Out of memory
by bart (Canon) on Feb 07, 2011 at 22:26 UTC
    • O(log n) => work increases logarithmically as n grows. (ie, work increases, but at a rate slower than the growth of n.)
    That's a pretty vague description of log(n)...

    An exact description of log(n) is: for every multiplication of n with a constant factor (like doubling, or 10-fold increase), a fixed amount of extra work gets added.

      That is better. We might also change "a fixed amount of extra work gets added" to "a single unit of extra work gets added." Even that is probably more vague than just coming to an understanding of what logarithmic graphs look like as 'n' increases. Unfortunately, that's sort of difficult to post here. I do like trying to improve the description though. Wikipedia has a great writeup on Big-O notation. Why didn't we have that when I was in college? ;)


      Dave