crunch_this! has asked for the wisdom of the Perl Monks concerning the following question:

So thanks to some users here I got my program working but I was told maybe it has a memory leak because with certain cases it will run for a couple hours & I'll get an out of memory error. & sure enough, using Test::LeakTrace (thx to this node http://perlmonks.org/?node_id=1001200 :) ) I found that a couple lines have leaking scalars & references. I tried the -verbose option but it was way over my head, not knowing much about the guts of perl. I could post what it says if it would help. Here's the suspect block:

use Math::Polynomial::Solve qw! poly_derivative poly_roots !;

# $rep is the right endpoint of the interval [0, $rep]
foreach my $a (3..$rep ) {
    foreach my $b (2..$a-1 ) {
        foreach my $c (1..$b-1 ) {
            # expanded form of x*(x - a)*(x - b)*(x - c)
            # coeffs are in an array
            my @quartic = (1, -$a - $b - $c, $a*$b + $a*$c + $b*$c, -$a*$b*$c, 0);
            my @derivative = poly_derivative(@quartic);
            my @zeros = poly_roots(@derivative);
            $haystack{"$a, $b, $c"} = \@zeros;
        }
    }
}

The problem lines according to Test::LeakTrace are

my @zeros = poly_roots(@derivative);

&

$haystack{"$a, $b, $c"} =  \@zeros;

Devel::Size tells me that, in one case anyway, the stuff that I want (the "needles" I guess) is roughly 236 bytes, but the %haystack, including all the information + references, is close to 4MB! What can I do? Those two lines also happen to be the only ones containing @zeros & poly_roots, & if there's a problem with the Math::Polynomial::Solve module I definitely don't want to be the one to mess with it.

PS- I've also tried Scalar::Util's weaken function on various things & it didn't seem to help. I wonder if I'm just not using it properly.

Re: help with memory leak
by hdb (Monsignor) on Apr 16, 2013 at 20:10 UTC

    Could you post a combination of parameters that causes problems? I guess $rep is sufficient.

    Another idea would be to isolate your code from the Math::Polynomial::Solve module. In the current code they are somehow linked through the @zeros array, even though they should not be. If you do

    $haystack{"$a, $b, $c"} = join "|", @zeros;

    as a test, this coupling should be much weaker as you only keep a string containing your results and not a reference.
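A minimal sketch of that test, using made-up values in place of the real poly_roots() output (the numbers and the key below are assumptions, not data from the thread). It shows that the hash then holds a plain string, and that the values can be recovered later with split. One caveat, stated as an assumption: if any root comes back as an object rather than a plain number, join would store its stringified form and split would hand back strings, not objects.

```perl
use strict;
use warnings;

my %haystack;
my @zeros = (0.5, 1.25, 3);        # made-up stand-in for poly_roots() output
my ($x, $y, $z) = (4, 2, 1);       # made-up loop values

# Store a plain string instead of a reference to @zeros.
$haystack{"$x, $y, $z"} = join '|', @zeros;

# Later: turn the stored string back into a list of values.
my @recovered = split /\|/, $haystack{"$x, $y, $z"};
print "@recovered\n";
```

Anything later in the program that expects an array reference as the hash value would need the same split step.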

      Changing \@zeros to join "|", @zeros fixes the leak on the line containing my @zeros = poly_roots(@derivative); but not the one with the hash assignment. I have no idea why. Changing that also affects things later in the program, but right now I guess my priority is to fix this leak. & I added a line to the original message to explain what $rep is.

        The "$a, $b, $c" portion of the hash uses scalars being passed in to the derivative function in @quartic. If you copy the loop counters like my $a_copy = $a and use those in @quartic, you may be able to bypass the scalar leak. I would say the module has some issues, but it looks like you can bypass them.

Re: help with memory leak
by kcott (Archbishop) on Apr 17, 2013 at 05:34 UTC

    G'day crunch_this!,

    From Test::LeakTrace:

    "Leaked SVs are SVs which are not released after the end of the scope they have been created. These SVs include global variables and internal caches. For example, if you call a method in a tracing block, perl might prepare a cache for the method. Thus, to trace true leaks, no_leaks_ok() and leaks_cmp_ok() executes a block more than once."

    When I tried your code with leaktrace(), a large number of leaks (~50) were detected; however, with no_leaks_ok(), I got none. This could mean I am unable to reproduce your problem. Please advise what OS and Perl version you're using. Can you also show your testing code. I'm using:

    $ perl -v
    This is perl 5, version 14, subversion 2 (v5.14.2) built for darwin-thread-multi-2level
    $ uname -a
    Darwin ganymede 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64

    The most obvious source of a memory leak in the code you posted is in the innermost loop, where you declare an array and take a reference to it. Every iteration through that innermost loop creates a new array with its own memory; this memory will not be released until %haystack goes out of scope (you don't show that in your code). So, @quartic and @derivative are both created afresh in the innermost loop and their memory should be freed after exiting that loop; however, multiple @zeros are created and their memory is not released because a reference is stored in %haystack.
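The retention described above can be observed with core Perl alone. This is only a sketch under stated assumptions: the Tracked class is hypothetical and exists purely to count when an array is freed, and the data is made up rather than coming from poly_roots(). It shows that each pass through the loop yields a distinct array while a reference is stored, and that nothing is freed until the hash lets go.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $destroyed = 0;
{
    package Tracked;                 # hypothetical class; it only counts frees
    sub DESTROY { $destroyed++ }
}

my (%haystack, %address_seen);
foreach my $i (1 .. 5) {
    my @zeros = ($i, 2 * $i);        # made-up data standing in for poly_roots()
    bless \@zeros, 'Tracked';        # lets us observe when the array is freed
    $haystack{$i} = \@zeros;         # this reference keeps the array alive
    $address_seen{ refaddr(\@zeros) }++;
}

my $distinct         = keys %address_seen;
my $destroyed_before = $destroyed;   # nothing freed while %haystack holds refs
%haystack = ();                      # dropping the references frees the arrays
print "distinct arrays: $distinct, freed before clear: $destroyed_before, after: $destroyed\n";
```

This prints five distinct arrays, zero freed before the hash is cleared, and five afterwards. In that sense the growth is retention by design rather than a leak in the strict sense.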

    I'm going to suggest a partial solution whereby you don't create any arrays at all in the innermost loop; it's partial because I don't know what you're doing with %haystack. The fact that you're not declaring and then assigning any arrays may give you a performance bonus: Benchmark will tell you.

    foreach my $x (3 .. $rep ) {
        foreach my $y (2 .. $x-1 ) {
            foreach my $z (1 .. $y-1 ) {
                # expanded form of x*(x - a)*(x - b)*(x - c)
                # coeffs are in an array
                $haystack{"$x, $y, $z"} = [
                    poly_roots(
                        poly_derivative(
                            1, -$x - $y - $z, $x*$y + $x*$z + $y*$z, -$x*$y*$z, 0
                        )
                    )
                ];
            }
        }
    }

    [Side issue: You'll note I've changed $a, $b and $c to $x, $y and $z. $a and $b are special package variables and it's generally best not to use them except for their intended purpose (see perlvar for more details). $c is not special: I just changed that also for completeness. $x, $y and $z may not be the best choices (but $a and $b are definitely bad choices); I'll leave you to decide what you want to call them.]
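To illustrate the side issue: sort passes each pair of elements to its block through the package variables $a and $b. The sketch below first shows the normal case, then compiles a sort inside a string eval with a lexical "my $a" in scope. As far as I know, Perl reports this as either a compile-time error or a warning depending on the version; the sketch makes no assumption about which and simply records whether anything was reported.

```perl
use strict;
use warnings;

# sort() hands each pair of elements to the block through the package
# variables $a and $b, which is why they need no declaration under strict.
my @sorted = sort { $a <=> $b } (3, 1, 2);
print "sorted: @sorted\n";

# With a lexical "my $a" in scope, the sort block sees the lexical instead.
# The string eval confines any resulting compile-time error or warning.
my @complaints;
{
    local $SIG{__WARN__} = sub { push @complaints, @_ };
    my $ok = eval q{
        my $a = 10;
        my @broken = sort { $a <=> $b } (3, 1, 2);
        1;
    };
    push @complaints, $@ unless $ok;
}
print "perl complained: ", (@complaints ? "yes" : "no"), "\n";
```

Renaming the loop variables, as done above, sidesteps the question entirely.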

    It would be helpful if you provided a self-contained script that showed:

    • What sort of values are assigned to $rep
    • How and where you're using %haystack (beyond the assignment you show)
    • What tests you're running

    If your output is particularly lengthy, you can wrap it in <spoiler>...</spoiler> or <readmore>...</readmore> tags (see Markup in the Monastery for details). For your self-contained script, you'll find guidelines in: How do I post a question effectively?

    -- Ken

      Looks like I've got a slightly different version:

      perl -v
      This is perl 5, version 14, subversion 2 (v5.14.2) built for MSWin32-x86-multi-thread

      uname -a didn't give me anything. I don't know what I did wrong

      Do you want to know what leaktest -verbose has to say? If there's a way to copy/paste it rather than type it by hand it would be great to know...

      What sort of values are assigned to $rep
      In this question & in testing I use small values (like <100) just to see how well it works. Once it's polished I want to use numbers basically as big as my computer can handle, like 1000 or bigger. So I expect the haystack to be huge even without any memory leaks. & if my computer can't handle that I'll go to some cloud computing thing to do it.

      What tests you're running
      In addition to the ones mentioned already, I've tried is_cycle_ok(), weaken() & I don't remember what else.

      How and where you're using %haystack (beyond the assignment you show)
      I was thinking of creating all polynomials with integer solutions, whose derivatives also have integer solutions. So a hash where the keys are the solutions of the polynomials and corresponding values are the solutions of the derivatives. The subroutine below searches the derivatives' zeros for zero-sets that only contain "approximate" integers. Then a hash slice gets created with the keys I want & Dumper prints it out. Here are the threads
      http://perlmonks.org/?node_id=1027571
      http://perlmonks.org/?node_id=1028273

      Here's the entire program, finished with assistance of user hdb who replied earlier in this node, & after changing a, b & c to x, y & z:

        Don't worry about uname. What you've shown indicates we're using the same Perl version but completely different OSes: I'm using Mac OS X; you're using MS Windows.

        Be aware how the number of iterations grows with respect to $rep: 4 = 4; 5 = 10; 10 = 120; 20 = 1140; 100 = 161700; and so on.
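Those figures can be checked with a short script. This is a sketch, not part of the original program: count_iterations and choose_three are names made up here. The first walks the same triple loop and counts passes; the second uses the closed form rep*(rep-1)*(rep-2)/6, which is the number of ways to pick three distinct values from 1..$rep.

```perl
use strict;
use warnings;

# Count passes through the innermost loop by brute force.
sub count_iterations {
    my ($rep) = @_;
    my $count = 0;
    foreach my $x (3 .. $rep) {
        foreach my $y (2 .. $x - 1) {
            foreach my $z (1 .. $y - 1) {
                $count++;
            }
        }
    }
    return $count;
}

# Closed form: ways to choose 3 distinct values from 1 .. $rep.
sub choose_three {
    my ($rep) = @_;
    return $rep * ($rep - 1) * ($rep - 2) / 6;
}

for my $rep (4, 5, 10, 20, 100) {
    printf "%4d => %d (formula: %d)\n",
        $rep, count_iterations($rep), choose_three($rep);
}
```

For $rep = 1000 the formula gives 166,167,000, so %haystack would hold that many keys, each with its own array of roots.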

        You didn't indicate how the code changes affected the leaks. My tests still show no leaks — I'm unable to reproduce your problem.

        I can't see any leak-related issues with %haystack; although, that doesn't mean there aren't any. I do note that you're populating @wants twice with what appears to be the same data:

        my @wants = grep { is_approximately_an_integer( @$_ ) } values %haystack;
        # intervening comments here only
        @wants = grep { is_approximately_an_integer( @{$haystack{$_}} ) } keys %haystack;

        Obviously, you only want one of those. Also, if %haystack is only used to populate @wants (you don't show any other usage in your code), consider whether the is_approximately_an_integer() filter might be better placed in the innermost loop (possibly doing away with %haystack altogether).
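A rough sketch of filtering inside the innermost loop, so that only matching entries are ever stored. Several things here are assumptions rather than code from the thread: derivative_roots is a hypothetical stand-in for poly_roots(poly_derivative(...)) so the example runs without the module, and it uses plain bisection, which works in this case because the derivative has exactly one real root between each pair of neighbouring roots 0 < z < y < x of the quartic. The body of is_approximately_an_integer and its tolerance are also guesses, since the thread's version is not shown.

```perl
use strict;
use warnings;

# Hypothetical stand-in for poly_roots(poly_derivative(...)).
# Requires $x > $y > $z > 0; bisects once in each of (0,z), (z,y), (y,x).
sub derivative_roots {
    my ($x, $y, $z) = @_;
    my ($s1, $s2, $s3) = ($x + $y + $z, $x*$y + $x*$z + $y*$z, $x*$y*$z);
    # derivative of x^4 - s1*x^3 + s2*x^2 - s3*x
    my $d = sub { my $t = shift; 4*$t**3 - 3*$s1*$t**2 + 2*$s2*$t - $s3 };
    my @edges = (0, $z, $y, $x);
    my @roots;
    for my $i (0 .. 2) {
        my ($lo, $hi) = @edges[$i, $i + 1];
        my $sign_lo = $d->($lo) < 0 ? -1 : 1;
        for (1 .. 60) {
            my $mid      = ($lo + $hi) / 2;
            my $sign_mid = $d->($mid) < 0 ? -1 : 1;
            if ($sign_mid == $sign_lo) { $lo = $mid } else { $hi = $mid }
        }
        push @roots, ($lo + $hi) / 2;
    }
    return @roots;
}

# Assumed definition and tolerance; the thread's own version is not shown.
sub is_approximately_an_integer {
    my $eps = 1e-6;
    for my $value (@_) {
        return 0 if abs($value - sprintf('%.0f', $value)) > $eps;
    }
    return 1;
}

my $rep = 12;       # small made-up value
my %wanted;         # only the matches are stored
my $tested = 0;
foreach my $x (3 .. $rep) {
    foreach my $y (2 .. $x - 1) {
        foreach my $z (1 .. $y - 1) {
            $tested++;
            my @zeros = derivative_roots($x, $y, $z);
            $wanted{"$x, $y, $z"} = \@zeros
                if is_approximately_an_integer(@zeros);
        }
    }
}
printf "tested %d triples, kept %d\n", $tested, scalar keys %wanted;
```

With this shape, memory use grows with the number of matches rather than with the number of triples tested.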

        -- Ken