Welcome to Perl Monks!
If I am reading your program right, it is running on the order of O(N^2), where N = $total, i.e. the cumulative grand total of all population years and then some. It definitely isn't going to scale well.
The culprit, I suspect, is the way you are calculating your random variations. Each of those calculations runs $total*$gener loop iterations, where $total is the cumulative grand total for all years and then some, and $gener is the largest gap in years. You have at least three routines doing this: popfilea(), popnum1(), and popnum2(). At least one of these mega-loops appears to be called for each and every cell of your table. Since one of the table dimensions is derived from the grand total, you end up in O(N^2) land. However, to be sure of the real cause, you should use a profiler, as was suggested above.
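For instance, if you can install Devel::NYTProf from CPAN (Devel::DProf is an older alternative), profiling a run is as simple as (yourscript.pl standing in for your actual script name):

perl -d:NYTProf yourscript.pl    # writes a nytprof.out profile
nytprofhtml                      # turn the profile into browsable HTML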
There are some additional minor efficiency issues. As these involve very small numbers of elements, I doubt they are the source of your problems, but they are habits to watch out for:
- Statements not directly involved in the loop are included in the loop. For example, when you parse the CGI stream, you not only read in the key-value pairs but also reinitialize each and every variable storing form data on every pass. Better to initialize those variables once, after you finish parsing the stream (see the first sketch after this list).
- Finding the maximum with reverse sort. Were $mpe, etc. and $mpy, etc. stored in arrays rather than a pile of scalar variables, datafilla() could initialize them with a loop and look for the maximum value as it went. That way you would get the maximum in a single pass, no sorting needed. But even if you stuck with all those variables, reverse sort is a very slow way to get the maximum: it involves at least two passes (one to sort and one to reverse), and sorting costs O(N log N) besides. Much better to loop through the values and update an $iMax variable only when you find a bigger number (see the second sketch after this list). The long and short of it is that you are doing at least four passes where one would do. However, this comment is more for future reference.
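On the first point, a minimal sketch of what I mean (the field names here are made up for illustration):

# parse the stream first -- nothing but parsing inside the loop
my %FORM;
foreach my $pair (@pairs) {
    my ($name, $value) = split /=/, $pair, 2;
    $FORM{$name} = $value;
}
# ...then initialize the form variables once, after the loop
my $startyear = $FORM{startyear};    # hypothetical field name
my $endyear   = $FORM{endyear};      # hypothetical field name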
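And on the second point, a one-pass maximum looks like this (List::Util, which ships with Perl, also provides a ready-made max() that does the same thing):

my $iMax = $mpe[0];
foreach my $value (@mpe) {
    $iMax = $value if $value > $iMax;   # update only on a bigger number
}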
As for reducing the complexity of your functions: move all those $fooN variables into arrays, e.g. @mpe, @mpy, @grand, etc. Then you can use a simple loop to set things up instead of all those repetitive, nearly identical lines.
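For instance, if the form fields are named mpe1, mpe2, and so on (I am guessing at the exact names), all those assignments collapse to:

my @mpe;
foreach my $i (1 .. 10) {    # or however many fields you have
    push @mpe, $FORM{"mpe$i"};
}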
Finally, this code was very hard to read, largely because it was hard to tell which variables went with which processing steps. The following tips might remedy this:
- Move all those variables into arrays. This reduces the number of variables readers have to keep track of mentally. It also lets you clarify your logic by using loops instead of repetitive code. Repetitive code is easy to write but very hard to read: the reader can't be sure that all those lines really are variations on the same theme without studying each and every one carefully. Not fun.
- Encapsulate the CGI parsing into its own function. Several of your variables are global in scope but are used only by the CGI-stream parsing phase: %FORM, @pairs, $buffer, $name, and $value are all declared as package globals even though they are never used after the form is read in.
- In general, try to keep variables close to where they are used. Even if a variable is global, it is still a good idea to pass any data a function uses in as parameters and to return any data it generates as return values. That way the code documents for itself which inputs and outputs go with each function. To do this well, you will need to familiarize yourself with array and hash references.
Something like this:
# $hFORM is a hash reference
my $hFORM = parseCGIStream(\*STDIN);
# input is stored as key-value pairs in $hFORM
# output is all these variables
my ($aMpe, $aMpy, $aGrand, $aPopbyyer
, $total, $gener) = datafilla($hFORM);
# input is $gener, $total
# output is $aOa (array reference storing your @aoa)
my $aOa = popfilea($gener, $total);
#... and so on ...
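A rough sketch of what parseCGIStream() might look like (this is the classic hand-rolled decoder; the CGI module's param() does all of this for you and handles the edge cases):

sub parseCGIStream {
    my ($fh) = @_;
    # slurp the POST body; length comes from the environment
    read($fh, my $buffer, $ENV{CONTENT_LENGTH} || 0);
    my %form;
    foreach my $pair (split /&/, $buffer) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        # undo the URL encoding
        foreach ($name, $value) {
            tr/+/ /;
            s/%([0-9a-fA-F]{2})/chr(hex($1))/ge;
        }
        $form{$name} = $value;
    }
    return \%form;    # hash reference, as used above
}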