in reply to Massive Perl Memory Leak

Well, looking at the pseudocode, it all depends on how much data you're putting in %datahash.

100 threads taking up 2 GB is "only" 20 MB per thread. The thread-local data in %datahash alone could account for that. As noted above, you should not count on perl sharing any data between threads (not even internally) unless you explicitly arrange for that. In other words, perl threads behave much more like separate processes than most other thread implementations do.
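
If you do want data shared between threads, you have to say so with threads::shared. A minimal sketch (the variable names and data sizes here are made up for illustration):

    use threads;
    use threads::shared;

    # Ordinary data is cloned into every new thread; only variables
    # marked ':shared' exist once and are visible to all threads.
    my %private = ( blob => 'x' x 1_000_000 );   # duplicated per thread
    my %results : shared;                        # one copy for everyone

    my @workers = map {
        threads->create( sub {
            my $id = shift;
            # %private here is this thread's own megabyte-sized copy;
            # writes to %results are seen by every other thread.
            $results{$id} = length $private{blob};
        }, $_ );
    } 1 .. 4;

    $_->join for @workers;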

It's easy enough to check whether that's the problem: reduce the number of threads. If memory use scales proportionally with the number of threads, there probably isn't a leak; you're just suffering from perl's memory-hungry approach to problem solving.

update: I just want to note that using threads in this program doesn't seem to be necessary (assuming you're using threads to run multiple SNMP connections in parallel). It's perfectly possible (and probably more efficient) to do that in a single thread. For instance, you could use non-blocking IO and select() instead. I believe that Net::SNMP even provides a callback/non-blocking interface. See also POE::Component::SNMP.
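
Untested, but the non-blocking style looks roughly like this (the hostnames, community string, and OID are placeholders):

    use strict;
    use warnings;
    use Net::SNMP;

    my $sysDescr = '1.3.6.1.2.1.1.1.0';
    my @hosts    = qw( 192.0.2.1 192.0.2.2 192.0.2.3 );

    for my $host (@hosts) {
        my ($session, $error) = Net::SNMP->session(
            -hostname    => $host,
            -community   => 'public',
            -nonblocking => 1,
        );
        die "session to $host failed: $error\n" unless $session;

        $session->get_request(
            -varbindlist => [ $sysDescr ],
            -callback    => sub {
                my ($s)    = @_;
                my $result = $s->var_bind_list;
                printf "%s: %s\n", $s->hostname,
                    $result ? $result->{$sysDescr} : $s->error;
            },
        );
    }

    # One process, one thread: the dispatcher multiplexes all pending
    # requests over non-blocking sockets until every callback has fired.
    snmp_dispatcher();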

Re^2: Massive Perl Memory Leak
by wagnerc (Sexton) on Jun 12, 2007 at 19:11 UTC
    The %datahash hash only accumulates 50-150 KB depending on the device, and it's blown away at each iteration. The script saves very little data persistently. I monitor those variables for size and they don't grow out of control. Watching the script with top shows the memory usage growing geometrically: slow at first, then climbing 1-5 MB every few seconds. The memory usage accelerates.

    I have a previous version of the script that doesn't have the memory leak. The only real difference is the use of a central datahash to keep everything. The old script just directly prints everything. That's why I think this is the culprit.

    I wrote a test script that used PadWalker to check on the my'd variables with peek_my and then dump that hash with Data::Dump. That showed something very interesting. After I assigned the independent hashes into the datahash (%{$datahash{"branch"}} = %branchdata;) the dump showed both hashes were referring to the same data. It wasn't a pure copy. Like this:

    do {
        my $a = {
            "%cdp" => {
                "1.4" => {
                    id       => "switch",
                    ip       => "1.2.3.4",
                    platform => "WS-C6509",
                },
            },
            "%datahash" => {
                cdp => { "1.4" => 'fix', "1.5" => 'fix' },
            },
        };
        $a->{"%datahash"}{cdp}{"1.4"} = $a->{"%cdp"}{"1.4"};
    };
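
    The probe itself boils down to roughly this (simplified, with dummy data):

    use PadWalker qw(peek_my);
    use Data::Dump qw(dd);

    sub probe {
        my %cdp = ( '1.4' => { id => 'switch', ip => '1.2.3.4',
                               platform => 'WS-C6509' } );
        my %datahash;

        %{ $datahash{cdp} } = %cdp;   # the shallow copy under suspicion

        dd peek_my(0);   # dump every lexical in this sub as one structure,
                         # so shared innards show up as back-references
    }
    probe();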

    Does this shed any light on anything? Thanks.

      If %branchdata contains references to hashes or arrays, then doing a shallow copy into a key of %datahash will result in the referenced data being shared (not in the threads::shared sense!) between the two structures. E.g.

      use Data::Dumper;

      my %a = ( 1 => { a => 'b', c => 'd' }, 2 => [ 1 .. 5 ] );
      my %b;
      %{ $b{copy} } = %a;          # shallow copy
      print Dumper \%a, \%b;

      $VAR1 = {
                '1' => {
                         'c' => 'd',
                         'a' => 'b'
                       },
                '2' => [ 1, 2, 3, 4, 5 ]
              };
      $VAR2 = {
                'copy' => {
                            '1' => $VAR1->{'1'},
                            '2' => $VAR1->{'2'}
                          }
              };

      To deep copy compound structures, you need to use Clone or (as someone else advised earlier) Storable::dclone().
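
      A minimal sketch of the dclone() route (untested):

      use Storable qw(dclone);

      # dclone() takes a reference and returns a deep copy: nested
      # hashes and arrays are duplicated rather than shared.
      $datahash{branch} = dclone( \%branchdata );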

      But whether that has anything to do with your memory growth is impossible to tell from the code shown. If both of these structures are lexical and being garbage collected, then that (shallow copying) should not be the cause of the symptoms you are describing.

      It sounds like you may be creating some kind of a circular reference somewhere. This could cause the kind of accelerating memory growth you mention. Maybe. But again, it's just guesswork without seeing the code.
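
      For example, a pair like this keeps itself alive forever, because Perl frees memory by reference counting and neither count can reach zero; Scalar::Util::weaken() is the usual way to break the cycle (sketch):

      use Scalar::Util qw(weaken);

      my $parent = { name => 'root' };
      my $child  = { name => 'leaf', parent => $parent };
      $parent->{child} = $child;    # cycle: each hash keeps the other alive

      # Without this, the pair leaks even after $parent and $child go
      # out of scope; weakening one link lets the counts reach zero.
      weaken $child->{parent};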

      And the problem does not seem to be related to your use of threads. Though it is obviously exaggerated by there being 100 times as many 'programs' all doing the same bad thing--whatever that is.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Right now I'm reverting to rebuilding the script block by block from the old version that doesn't spiral out of control. Mind you, that version's no slouch either: it grows a good bit, but it stays reasonable by Perl standards, say 33% growth from start to completion. Most of the code is the same, and of the same type, but whatever changed pushed the memory growth to 400%.

        Speaking of garbage collection, is there any way to look into Perl to see just what, when, and where things get garbage collected? I've had, and still have, a suspicion that Perl is just not garbage collecting the way it should.
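
        As far as I can tell Perl has no background collector to watch: it frees a value the instant its reference count hits zero. Devel::Peek at least shows those counts (a rough sketch, not from the script above):

        use Devel::Peek;

        my %inner = ( key => 'value' );
        my $ref1  = \%inner;
        my $ref2  = \%inner;    # every reference bumps the count

        # Dump() prints the variable's guts to STDERR; the REFCNT
        # field is what Perl checks to decide when to free it.
        Dump( \%inner );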

      I have a previous version of the script that doesn't have the memory leak. The only real difference is the use of a central datahash to keep everything. The old script just directly prints everything.
      Can you show the parts of your non-leaking script that are equivalent to what you show in your initial question?