in reply to still memory leak

Your program is not leaking memory, it is consuming it! By design. Your design.

You've added -w, but you still haven't used strict, and that is concealing one of the reasons you are using a lot of memory.

At least way more than you need to.

@haplotypes = (); ## THIS IS AN ARRAY DECLARATION ..... push @{ $haplotypes{ $currentPop} }, [ @temp ]; ## THIS ...........^............^...................is a hash ## with a numeric key, that increments from zero, by 1!

Assuming your figure of "hundreds of thousands of repeats" means 100,000, then simply changing

$haplotypes{ $currentPop} to $haplotypes[ $currentPop ]

in five places in your code will save you 11 MB. If it means 500,000 thousand, then 55 MB saved.

This:

@temp = split; push @{$haplotypes{$currentPop}}, [@temp];

would be better written as

push @{ $haplotypes[ $currentPop ] }, [ split ];

And may save some memory through reduced fragmentation? And this:

@{$TempList{$currentPop};} = (); while (@{$haplotypes{$currentPop};}) { push @{$TempList{$currentPop};}, splice(@{$haplotypes{$current +Pop};}, rand @{$haplotypes{$currentPop};}, 1); } @{$haplotypes{$currentPop};} = @{$TempList{$currentPop};};

Re-written as:

use List::Util qw[ shuffle ]; ## near the top of your program. ... @{ $haplotypes{$currentPop } } = shuffle @{ $haplotypes{$currentPo +p } };

may save some more for the same reason. Either way, it's clearer.

And finally , you are carefully squirreling away the $current population in @total

push @total, $currentPop;

Which you then use to index your $sampleSize hash. But $currentPop is simply a counter that gets incremented from 0.

So:foreach $_ ( @total ) {

Is identical to: foreach $_ ( 0 .. $currentPop ) {

except that it doesn't create a 10 MB array in the process.

But this gets worse.

In your final for loop, you use $_ to index into $sampleSize{ $_ }.

And %samplesize is a hash (despite the fact that you declare a hashref with the same name at the top of your program), that is index by $currentPop.

Now The value of $currrentPop that is being used as the key in %sampeSize will not be contiguous, making %sampleSize a "sparse array" and a hash a good choice. But..

You are pushing sequencial integer values ($currentPop) into an array (@total) which you the use to iterate the sparse (non-sequential) keys of %sampleSize? Which means that one in 4 (or more or less depending on how representative your sample data is) of the iterations of the inner for loop that is doing the printing, the controlling maximul value

for($i = 0; $i < $sampleSize{$_}/2; $i++) {

Is going to be an autovivified undef divided by 2! Which with -w in force (assuming you didn't just tack that in for posting) should be causing reams of

Use of uninitialized value in division (/)...
errors to be printed.

Which leads me to believe that you not telling the whole story and disinclines me to further analysis or help.

Try making your code compile and run under warnings and strict and then come back and you might get some more help.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

Replies are listed 'Best First'.
Re^2: still memory leak
by Juba (Initiate) on Aug 05, 2004 at 11:17 UTC
    Well, thanks for all the tips. I have already compiled the code and it works for say 50000 repeats without any error message. I am really telling the whole story. I'll try to incorporate all your suggestions. Thanks again