in reply to Re: cleaning up memory after closing files on MACOSX
in thread cleaning up memory after closing files on MACOSX

Here is roughly where I have pinpointed the memory leak (caution: I don't know if this code will actually run; it's just a snippet from my program):
$linectr = 0;
$ctr2 = 0;
for $line (@array) {
    $hash{$linectr} = $line;
    if ($ctr2 = 100000) {
        addtofile(\%hash);
        %hash = ();
        $ctr2 = 0;
    }
    $ctr2++;
    $linectr++;
}

sub addtofile {
    my $hashref = shift;
    open(FH, $tempfile);
    foreach $value (keys %$hashref) {
        print FH "$$hashref{$value}:$value\n";
    }
    close FH;
}
When I close the FH, some memory gets deallocated and the RSS size in top goes down, but not completely, so it keeps growing and growing. On Linux this happens slowly enough that my program usually finishes before I run out of memory, but on OS X the memory grows by leaps and bounds, and I run out of memory after about 10-15 passes through the program.

However, I found a solution to my problem using Berkeley DB. Since I am trying to sort files that are too large to fit in memory, I thought I would have to break them up and use a merge-sort algorithm, but if I tie an empty file to a Berkeley DB object I can treat the file as an array and insert lines into the middle of it, so there is no need to keep opening and closing files.
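A minimal sketch of that approach, assuming the core DB_File module and a hypothetical filename; the RECNO format lets a tied array stand in for a line-oriented file, so splice can insert lines into the middle:

```perl
use strict;
use warnings;
use Fcntl;
use DB_File;

# Hypothetical filename; any empty or nonexistent file will do
my $file = "lines.db";

# Tie the file to an array using Berkeley DB's record-number (RECNO) format
tie my @lines, 'DB_File', $file, O_RDWR | O_CREAT, 0644, $DB_RECNO
    or die "Cannot tie $file: $!";

push @lines, "first", "third";
splice @lines, 1, 0, "second";   # insert into the middle of the "file"

untie @lines;
```

Each array element corresponds to one line of the underlying file, so there is no need to reopen or rewrite the whole file per insertion.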

janitored by ybiC: balanced <code> tags as per Monastery convention, and a bit o'formatting

Replies are listed 'Best First'.
Re: Re: Re: cleaning up memory after closing files on MACOSX
by sgifford (Prior) on Aug 29, 2003 at 02:37 UTC

    Glad you found a solution to your problem!

    As far as the memory leak, how big is @array? Does a loop like this leak memory?

    foreach my $i (0..$#array) {
        open(FH, "> temp$i") or die "open: $!";
        print FH $array[$i];
        close FH;
    }

    A few other random comments:

    • Do you realize that if($ctr2=100000) will always be true, since = is the assignment operator, not a comparison operator? You want == here.
    • A hash is a strange data structure to store an ordered list of things, like lines in a file. An array is more appropriate.
    • Even better, rather than keeping 100000 lines in memory at a time, just keep the output file open, then print lines to it as you read them.
    • Also consider using use strict and the -w flag, and using lexical variables for your loops:
      for my $line (@array){
      
      If you are doing something you don't realize, like causing a memory leak, this will often tell you. It also makes it easier to see that a leak isn't happening, since a lexical loop variable disappears immediately after the loop.
    • Or see if you can get sort(1) to do this for you.
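Putting those suggestions together, here is a hedged sketch of the loop (with sample data and a hypothetical output filename): it uses strict and warnings, a lexical loop variable, and streams each line straight to the open output file instead of batching 100000 lines in a hash:

```perl
use strict;
use warnings;

my @array = ("one", "two", "three");   # sample data standing in for the real lines

# Hypothetical output file; kept open for the whole loop, so each line is
# written as it is handled and nothing accumulates in memory.
open my $fh, '>', 'output.txt' or die "open: $!";
for my $line (@array) {
    print $fh "$line\n";
}
close $fh or die "close: $!";
```

With the output filehandle opened once and closed once, there is no repeated open/close per batch, and the memory footprint stays flat regardless of the input size.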