wagnerc has asked for the wisdom of the Perl Monks concerning the following question:

Greetings. I've run into a massive memory leak problem. I have a multithreaded script that goes out, collects info from network devices, and dumps it to a file (it will eventually go to MySQL). The problem is that after about half an hour the memory usage starts spiraling out of control. 100 threads start out using about 500 MB, but that soon grows to 2 GB. I've pared the script down to the bare minimum trying to locate the leak, but to no avail.

A colleague said he heard something about a problem with localized hashes within hashes not freeing memory properly. Has anyone heard anything about this? The polling subroutine writes everything to a thread-localized hash tree and the main loop writes that hash to a file. The code looks like this:

    require Net::SNMP;
    use Net::DNS;
    require Digest::MD5;
    use threads ('stack_size' => 131072);
    use threads::shared;
    use Thread::Queue;
    use Thread::Semaphore;
    use Data::Dump qw(dump pp);

    {   ## thread entry point
        for (;;) {
            local %datahash;
            ..getdevicetopoll..
            &ifpoll($dev);
            print FILE dump %datahash;
        }
    }

    sub ifpoll {
        my %devinfo;
        my %interfaces;
        ## etc...
        ..snmp a lot of data here..
        %{$datahash{"devinfo"}}    = %devinfo;
        %{$datahash{"interfaces"}} = %interfaces;
        return;
    }
All the code is pretty much like that: collect data in my'd hashes, then assign that data to the main local'd hash and write it out. I checked the symbol table (use Symbol) and %datahash is indeed blank. I even wrote a sub that walked %datahash and manually deleted each key. No difference. I also used the Internals module to check ref counts on the intermediate variables, but they're not increasing. My fear here is that there's some stealth "closure" situation going on, something preventing the lexical data from being garbage collected. Every variable is either my'd or local'd.

Any ideas on what this could be? Any ideas on how to even troubleshoot this? I'm using Solaris 8 and ActiveState Perl 5.8.8 build 817. I've tried build 820 and compiling my own Perl with -DDEBUGGING. No difference.

If anybody has any ideas I'd greatly appreciate it. Thanks.

Replies are listed 'Best First'.
Re: Massive Perl Memory Leak
by halley (Prior) on Jun 11, 2007 at 17:09 UTC
    Don't use local %datahash, use my %datahash.

    Unless you really really understand what the difference is between these two keywords, 99.99% of the time, you want my.

    The local keyword is a way to temporarily shadow the original value so changes aren't kept, and then automatically restore that original value later.

    I see that you're using the structure in both routines, but it seems a bit odd to work with the global %datahash from both (and others). I think you might want to check out the use of references, so you can work with data structures more generally instead of having multiple routines trying to protect their values and effects on these global variables.

    --
    [ e d @ h a l l e y . c c ]

      Thanks for getting back to me. I'm using the local %datahash construct so I can avoid gotchas with passing references (but it seems I've run into another gotcha somewhere else). %datahash is essentially just a thread global, and the local is there to automatically garbage collect it as the forever loop restarts. I can't use my because the variable wouldn't be visible to the descendant subroutines. Now correct me if I'm wrong, but isn't %datahash garbage collected the moment the forever loop loops? It's not creating *new* %datahashes and keeping the *old* ones stashed away, right? I manually walk %datahash and delete everything out of it at the end of the loop, so either way it shouldn't be building up memory.
        $foo = 5;

        sub do_something { print $foo, $/; }

        sub main {
            do_something();
            {
                local $foo = 6;
                do_something();
            }
            do_something();
        }

        main();
        The above code is a typical use of local. Internally, it is conceptually equivalent to the following code (shown for the innermost block only). You'd get 5, 6, 5 printed.
        {
            my $unnamed_backup_of_old_value_of_foo = $foo;
            $foo = 6;
            do_something();
            $foo = $unnamed_backup_of_old_value_of_foo;
        }

        The actual storage for $foo is created by the first statement ($foo = 5). That storage is what gets changed whenever you see assignments being done.

        This of course can get very big and hairy with a huge hash of data to "back up" and "restore" in the equivalent %unnamed_backup_of_old_value_of_datahash. Also, deleting items from the local hash has no real bearing, since the whole hash just gets tossed out and restored from the backup anyway.
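        Applied to your loop, each pass is conceptually doing something like this (the backup name is made up, as above; real local swaps values on Perl's save stack rather than copying element by element, but the storage it has to keep alive is comparable):

            {
                my %unnamed_backup_of_old_value_of_datahash = %datahash;  # save the old value
                %datahash = ();                                           # fresh value for this pass
                &ifpoll($dev);                                            # fills %datahash
                print FILE dump %datahash;
                %datahash = %unnamed_backup_of_old_value_of_datahash;     # restore on scope exit
            }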

        If you're doing local from multiple threads, well, you can see where your memory is going. Secondly, I can quite easily imagine a race condition where your main shared global hash is getting "restored" in the wrong order, leading to pandemonium.

        --
        [ e d @ h a l l e y . c c ]

        I can't answer the garbage collection issue except to say what you probably already know: Perl frees an object's memory once it is no longer referenced.

        However, using global variables to "pass" information between subs is bad, bad, bad! Instead, pass a reference to the descendant subs so it is clear where information is being used and possibly modified. If you find you are passing a lot of parameters, consider using lightweight object-oriented techniques by simply blessing a hash into the current package, then calling the descendant subs on the object:

        my $obj = bless {datahash => \%datahash, ...};
        $obj->sub1();
        $obj->sub2();
        ...

        sub sub1 {
            my $self = shift;
            $self->sub3(wibble => 3);
        }
        ...

        sub sub3 {
            my ($self, %params) = @_;
            for my $key (keys %{$self->{datahash}}) {
                my $value = $self->{datahash}{$key} * $params{wibble};
                ...
            }
        }

        and if you have trouble with getting references going, ask! Writing nasty code to avoid learning a new and valuable technique will not reward you in the long run and, by the sound of it, not even in the short run. You may find References quick reference helps.


        DWIM is Perl's answer to Gödel
      Do any of you see any problem with my hash usage? Specifically:

       %{$datahash{"devinfo"}} = %devinfo;

      I'm curious as to how the deep nature of %devinfo is copied over to the %datahash branch. Do any refs to the original variable survive? Or is it a 100% clean copy with no strings attached?

      Would either of these be equivalent or better syntax?

      $datahash{"devinfo"} = \%devinfo;
      $datahash{"devinfo"} = %devinfo;   # this must be wrong

        Consider:

        use strict;
        use warnings;
        use Data::Dump::Streamer;

        my %devinfo  = (1 => {a => 'apple', b => 'orange'});
        my %datahash = (devinfo => {});

        %{$datahash{"devinfo"}} = %devinfo;

        Dump(\%devinfo, \%datahash, $datahash{devinfo});

        Prints:

        $HASH1 = {
          1 => {
            a => 'apple',
            b => 'orange'
          }
        };
        $HASH2 = { devinfo => 'A: $HASH3' };
        $HASH3 = { 1 => $HASH1->{1} };
        alias_hv(%$HASH2, 'devinfo', $HASH3);

        The copy is a shallow copy. If any of the values of %devinfo are references, then the copy simply duplicates the reference. You may find Storable's dclone helps if you are looking for a clone of the data.
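        A minimal sketch of the dclone approach, reusing the data above:

            use Storable qw(dclone);

            my %devinfo = (1 => {a => 'apple', b => 'orange'});
            my %datahash;

            $datahash{devinfo} = dclone(\%devinfo);   # deep copy: nothing shared
            $devinfo{1}{a} = 'pear';
            print $datahash{devinfo}{1}{a}, "\n";     # still 'apple'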


        DWIM is Perl's answer to Gödel
Re: Massive Perl Memory Leak
by Joost (Canon) on Jun 11, 2007 at 22:25 UTC
    Well, looking at the pseudocode it all depends on how much data you're putting in %datahash.

    100 threads taking up 2 GB is "only" 20 MB per thread; the thread-local data in %datahash alone could account for that. As noted above, you should not count on perl sharing any data between threads (not even internally) unless you explicitly arrange for it. In other words, perl threads behave much more like separate processes than most other thread implementations do.

    It's easy enough to check whether that's the problem: reduce the number of threads. If the amount of memory taken is stable relative to the number of threads, there probably isn't a leak; you're just suffering from perl's memory-hungry approach to problem solving.

    update: I just want to note that using threads in this program doesn't seem to be necessary (assuming you're using threads to run multiple SNMP connections in parallel). It's perfectly possible (and probably more efficient) to do that in a single thread. For instance, you could use non-blocking IO and select() instead. I believe that Net::SNMP even provides a callback/non-blocking interface. See also POE::Component::SNMP.
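    For example, a minimal sketch of the non-blocking interface as I understand it (the host list, community string, and OID are placeholders):

        use strict;
        use warnings;
        use Net::SNMP;

        my @hosts    = qw(10.0.0.1 10.0.0.2 10.0.0.3);
        my $sysDescr = '1.3.6.1.2.1.1.1.0';

        for my $host (@hosts) {
            my ($session, $error) = Net::SNMP->session(
                -hostname    => $host,
                -community   => 'public',
                -nonblocking => 1,
            );
            die "session: $error\n" unless defined $session;

            $session->get_request(
                -varbindlist => [ $sysDescr ],
                -callback    => sub {
                    my ($s) = @_;
                    my $r = $s->var_bind_list;   # undef on error
                    print "$host: ", (defined $r ? $r->{$sysDescr} : $s->error), "\n";
                },
            );
        }

        snmp_dispatcher();   # one event loop runs all the pending requests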

      The %datahash hash only accumulates 50-150K depending on the device, and it's blown away at each iteration. The script saves very little data persistently. I monitor those variables for size and they don't grow out of control. Watching the script with top shows the memory usage growing geometrically: slow at first, then going up 1-5 MB every few seconds. The memory usage accelerates.

      I have a previous version of the script that doesn't have the memory leak. The only real difference is the use of a central datahash to keep everything. The old script just directly prints everything. That's why I think this is the culprit.

      I wrote a test script that used PadWalker to check on the my'd variables with peek_my, and then Data::Dump'ed that hash. That showed something very interesting. After I assigned the independent hashes into the datahash (%{$datahash{"branch"}} = %branchdata;), the dump showed both hashes referring to the same data. It wasn't a pure copy. Like:

      do {
          my $a = {
              "%cdp" => {
                  "1.4" => {
                      id       => "switch",
                      ip       => "1.2.3.4",
                      platform => "WS-C6509",
                  }
              },
              "%datahash" => {
                  cdp => { "1.4" => 'fix', "1.5" => 'fix' },
              },
          };
          $a->{"%datahash"}{cdp}{"1.4"} = $a->{"%cdp"}{"1.4"};
      };

      Does this shed any light on anything? Thanks.

        If %branchdata contains references to hashes or arrays, then doing a shallow copy into a key of %datahash will result in the sharing (not in the threads::shared sense!) of the referenced data between the two structures. E.g.

        %a = ( 1 => { a => 'b', c => 'd' }, 2 => [ 1..5 ] );
        %{ $b{copy} } = %a;
        print Dumper \%a, \%b;

        $VAR1 = {
          '1' => { 'c' => 'd', 'a' => 'b' },
          '2' => [ 1, 2, 3, 4, 5 ]
        };
        $VAR2 = {
          'copy' => {
            '1' => $VAR1->{'1'},
            '2' => $VAR1->{'2'}
          }
        };

        To deep copy compound structures, you need to use Clone or (as someone else advised earlier) Storable::dclone().

        But whether that has anything to do with your memory growth is impossible to tell from the code shown. If both of these structures are lexical and being garbage collected, then that (shallow copying) should not be the cause of the symptoms you are describing.

        It sounds like you may be creating some kind of a circular reference somewhere. This could cause the kind of accelerating memory growth you mention. Maybe. But again, it's just guesswork without seeing the code.
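        For reference, the classic shape of such a leak, and the usual fix, looks like this (a made-up illustration, not code from your script):

            use Scalar::Util qw(weaken);

            for (1 .. 1_000) {
                my %node;
                $node{self} = \%node;    # cycle: the refcount can never reach zero
                # weaken($node{self});   # uncomment to break the cycle and free the hash
            }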

        And the problem does not seem to be related to your use of threads. Though it is obviously exaggerated by there being 100 times as many 'programs' all doing the same bad thing--whatever that is.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        I have a previous version of the script that doesn't have the memory leak. The only real difference is the use of a central datahash to keep everything. The old script just directly prints everything.
        Can you show the parts of your non-leaking script that are equivalent to what you show in your initial question?
Re: Massive Perl Memory Leak
by BrowserUk (Patriarch) on Jun 11, 2007 at 17:48 UTC

    Working code is always preferable to pseudo-code.

    But...does it make any difference if you don't use Data::Dump::dump()? E.g. just print the hash rather than dumping it. My thought is that it may not be thread-safe and is accumulating internal data (say, for circular reference detection) that isn't being cleaned up, or is getting replicated.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      The hash can't be printed otherwise. It's a deep tree, and just print'ing it would give the top-level keys and HASH(0x12345). Sorry I only have pseudocode; the actual program is huge[TM]. Although I could try to make a junk program that does nothing except assign to %datahash.

      The %datahash tree essentially looks like:

      (
          "index", {
              baseip   => "1.2.3.4",
              deviceid => "81F1FD9964973BB63B3446881F901AF5",
              sysname  => "router",
          },
          "routes", {
              "0.0.0.0.0.0.0.0.2.3.4.5" => {
                  ifindex => 0,
                  mask    => "0.0.0.0",
                  net     => "0.0.0.0",
                  nexthop => "2.3.4.5",
                  proto   => 3,
                  type    => 4,
              },
          },
          "interfaces", {
              1 => {
                  ARP           => 3,
                  ifAdminStatus => 1,
                  ifAlias       => "",
                  ifDescr       => "FastEthernet0/0",
                  ifIndex       => 1,
                  ifOperStatus  => 1,
                  ifSpeed       => 100_000_000,
                  ifType        => "ethernetCsmacd",
              },
      I found the page that talked about the hash of hashes problem. It's http://www.chat11.com/Perl_Memory_Usage . Ring any bells?
        The hash can't be printed otherwise. It's a deep tree and just print'ing it would just give the top level keys...

        I meant: avoid the use of dump just temporarily, to see if it is the cause of the memory growth.

        If avoiding it fixed the immediate problem, then you could look for another way of recursively printing deep data structures that didn't suffer the same problem. Maybe Data::Dumper, or Data::Dump::Streamer or even rmap with a little logic of your own.
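        For what it's worth, the Data::Dumper variant is a one-line swap (a sketch; FILE and %datahash are from your original post):

            use Data::Dumper;

            print FILE Dumper(\%datahash);   # in place of: print FILE dump %datahash;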

        But first you need to track down the source of the problem. You could equally just comment out the print altogether.

        Is %datahash shared?

        What's in the hash doesn't matter.

        There is no logic I know of for using local rather than my in threads.

        Without sight of the full program (stick it on your scratchpad and /msg me if you don't want to post it), or a cut down but working version that demonstrates the same symptoms, it's really impossible to offer much in the way of help.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Massive Perl Memory Leak
by perrin (Chancellor) on Jun 11, 2007 at 18:54 UTC

    I don't know much about how thread data gets garbage collected. I do know threads use a lot of memory, though, since every time you make a new thread it copies all of the data structures in your perl interpreter. They are not shared, except for the ones you specifically mark.

    In addition, the way that memory allocated for perl structures gets returned when they go out of scope is not always intuitive. The memory typically stays allocated after the variable goes out of scope. But I don't know how this interacts with threads.

    Were you using local %datahash because you thought you needed that to prevent it from being shared by threads? You can actually just use our %datahash, since nothing is shared. Each thread will have a separate copy automatically.

      I was using local %datahash; as a cheap way of reinitializing the variable at each loop. Simply doing %datahash = (); would be functionally identical for my requirements.
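      So the loop would become something like this (a sketch, combining the our suggestion with the plain reset):

          our %datahash;          # package global: visible to descendant subs,
                                  # and each thread gets its own copy
          for (;;) {
              %datahash = ();     # cheap reinitialization, no save/restore
              ..getdevicetopoll..
              ifpoll($dev);
              print FILE dump %datahash;
          }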
Re: Massive Perl Memory Leak
by Zen (Deacon) on Jun 11, 2007 at 19:49 UTC
    I checked out the memory leak article you referenced. It gives no code nor evidence of any kind that it is a perl problem. That being said, there may or may not be a memory leak in this situation.

    In the short time I've been a member and seen the trepidation surrounding perl apps, I've never actually seen any "perl has a memory leak and it's ruining my program" posts substantiated. Mostly just unfounded suspicions that could be traced to what the other monks suggested: inconvenient scoping and non-thread-safe modules that could be keeping references to your hash, thereby foiling the garbage collector.
Re: Massive Perl Memory Leak
by BrowserUk (Patriarch) on Jun 12, 2007 at 23:45 UTC

    I have another suggestion. I should have thought to mention it earlier.

    Modify the program to call the thread code directly from main and comment out the thread creation step.

    If the 'leak' is associated with the threads it should 'go away' completely, but my bet is that it will still happen--only 100 times more slowly. Once you've fixed the leak in the non-threaded version, the threaded version won't leak either.

    The simplest and best way to debug threaded code is to make the thread functions work individually as non-threaded functions first, and only once you are happy they are working correctly, spawn copies.

    This is so ingrained in me, that it didn't cross my mind to mention it before now.
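    For example (poll_loop standing in for whatever your real thread entry point is called):

        ## my @workers = map { threads->create( \&poll_loop ) } 1 .. 100;  # disabled
        ## $_->join() for @workers;
        poll_loop();   # same code path, one thread: any leak left is a plain Perl leak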


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      But if you'd posted the code 3 days ago as asked, your problem would probably be fixed by now.
      Oh? :P The problem with that is that the code is over 100K, and there's really nothing wrong with my code. I mean, I've been staring at it for weeks, and I have been doing this for many years.

      I think I've found something with my rebuild effort: a code block that, when commented out, stops the script from leaking. I really defy anyone to find anything syntactically wrong with this code. :)

      ## all in ifpoll
      local %routes = ();
      my $result6;

      ## device can have CIDR or non-CIDR routing
      $result6 = $$session->get_entries(-columns =>
          [$ipCidrRouteIfIndex, $ipCidrRouteProto, $ipCidrRouteType]);
      $result6 = $$session->get_entries(-columns =>
          [$ipRouteNextHop, $ipRouteIfIndex, $ipRouteMask,
           $ipRouteProto, $ipRouteType]) unless %$result6;

      if (!defined($result6)) {
          printf(STDERR "ERROR(routes): %s %s.\n", $devarg, $$session->error);
      }

      local $testkey = each %$result6;

      if ($testkey =~ m/^\Q1.3.6.1.2.1.4.24.4.1.\E/o) {
          print "Entering CIDR parsing\n";
          foreach $key (keys %$result6) {
              if ($key =~ m/^\Q$ipCidrRouteProto\E/o) {
                  my @temp = $key =~ m/^\Q$ipCidrRouteProto\E\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.0\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/o;
                  my $id = join ".", @temp;
                  $routes{$id}{"proto"} = ${$result6}{$key};
              }
              elsif ($key =~ m/^\Q$ipCidrRouteType\E/o) {
                  my @temp = $key =~ m/^\Q$ipCidrRouteType\E\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.0\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/o;
                  my $id = join ".", @temp;
                  $routes{$id}{"type"} = ${$result6}{$key};
              }
              elsif ($key =~ m/^\Q$ipCidrRouteIfIndex\E/o) {
                  ## key is: dest network . dest mask . 0 . next hop
                  my @temp = $key =~ m/^\Q$ipCidrRouteIfIndex\E\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\.0\.(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/o;
                  my $id = join ".", @temp;
                  $routes{$id}{"net"}     = $temp[0];
                  $routes{$id}{"mask"}    = $temp[1];
                  $routes{$id}{"nexthop"} = $temp[2];
                  $routes{$id}{"ifindex"} = ${$result6}{$key};
              }
          }
      }
      elsif ($testkey =~ m/^\Q1.3.6.1.2.1.4.21.1.\E/) {
          print "Entering non CIDR parsing\n";
          foreach $key (keys %$result6) {
              my $index;
              if (($index) = $key =~ m/^\Q$ipRouteNextHop\E\.(.+)/o) {
                  $routes{$index}{"nexthop"} = ${$result6}{$key};
                  $routes{$index}{"net"}     = $index;
              }
              elsif (($index) = $key =~ m/^\Q$ipRouteIfIndex\E\.(.+)/o) {
                  $routes{$index}{"ifindex"} = ${$result6}{$key};
              }
              elsif (($index) = $key =~ m/^\Q$ipRouteMask\E\.(.+)/o) {
                  $routes{$index}{"mask"} = ${$result6}{$key};
              }
              elsif (($index) = $key =~ m/^\Q$ipRouteProto\E\.(.+)/o) {
                  $routes{$index}{"proto"} = ${$result6}{$key};
              }
              elsif (($index) = $key =~ m/^\Q$ipRouteType\E\.(.+)/o) {
                  $routes{$index}{"type"} = ${$result6}{$key};
              }
          }
      }

      %{$datahash{"routes"}} = %routes;
      $session is a reference to the Net::SNMP object. $result6 is a reference to an anonymous hash containing the SNMP return values. Comment this all out and no memory leak.

      Any ideas?

        Best guess. If you just commented out the two lines that call $$session->get_entries(...), you'd still not see the memory growth. Not that that would be surprising, as the rest of the code wouldn't be doing much of anything.

        Looking at it from the other direction. Leave all the other code commented out and just call whichever of those two lines is appropriate--but do nothing with the results returned from the call--and the memory growth will return.

        $session is a reference to the Net::SNMP object

        Is that module thread safe? Are you reusing a single object to talk to multiple devices? Are you sharing one instance between threads?

        You might be able to check whether the session object is accumulating data internally, long after you have finished with it, by using Devel::Size on $session after each call to get_entries().
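        Something along these lines (a sketch reusing the names from your snippet; Devel::Size's total_size() walks the whole structure):

            use Devel::Size qw(total_size);

            my $before = total_size($$session);
            $result6 = $$session->get_entries(-columns => [$ipCidrRouteIfIndex]);
            printf STDERR "session grew by %d bytes\n",
                total_size($$session) - $before;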


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Massive Perl Memory Leak
by Jenda (Abbot) on Jun 13, 2007 at 14:41 UTC

    1) Don't use & when calling subroutines. And if you do, DO MAKE SURE you do not use it when calling a subroutine with no parameters! Unless you know what that means.

    2) Don't use

    %{$datahash{"devinfo"}} = %devinfo;
    use
    $datahash{"devinfo"} = \%devinfo;
    instead. It's silly to force perl to flatten the %devinfo hash into a list, build a new hash containing the data, and then throw %devinfo out the window. Just take a reference.

    3) Don't use the global. Pass a reference to the %datahash

    for (;;) {
        my %datahash;
        ..getdevicetopoll..
        ifpoll(\%datahash, $dev);
        print FILE dump %datahash;
    }
    ...
    sub ifpoll {
        my ($datahash, $dev) = @_;
        my %devinfo;
        my %interfaces;
        ## etc...
        ..snmp a lot of data here..
        $datahash->{"devinfo"}    = \%devinfo;
        $datahash->{"interfaces"} = \%interfaces;
        return;
    }

    Neither of those changes should affect the leak, though; it must be somewhere else in the code. Are you sure you are not producing any cyclic references in the part of the code you did not show us?

      Hi. When dealing with my own subs I always use the &name(); form. That way everything is made explicit and differentiating the subs from the builtins is easier. Wouldn't re-my'ing datahash in ifpoll break the reference back to the caller's datahash?

      Oh, so Perl flattens hash assignments? I thought it was smart enough to know you were assigning a hash to a hash.

      UK: The problem with validating it single-threaded is that the leak is obscured in Perl's normal bloat, and the number of iterations needed to show a noticeable memory increase would take a *long* time. Even with 100 threads it doesn't go into the death spiral for 20-30 minutes. But I know what you're saying.

        1) Well, if you feel better using the &, go ahead; just keep in mind that &foo(); and &foo; are two very different things. The first statement calls the foo() subroutine with no parameters, while the second one calls it aliasing its @_ to the current @_. That is, any changes to @_ made within foo() will affect the current subroutine!
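        A small demonstration (the sub names are made up):

            sub show { print "got: @_\n" }

            sub demo {
                &show();   # explicitly empty argument list: prints "got: "
                &show;     # reuses the caller's @_:         prints "got: 1 2 3"
            }

            demo(1, 2, 3);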

        2) No, the my affects only the variable, not its contents. So ifpoll() receives a reference to the caller's %datahash and assigns that reference to its own variable $datahash. The my itself doesn't affect the %datahash in any way.

        3) Perl has to flatten hash assignments. It might make some optimizations, like preallocating the right number of buckets, but it has to create a new hash and copy the values, so that changing one hash doesn't affect the other.
        The %newhash = %oldhash; statement makes a copy, not another name for the same "object".
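        A short illustration:

            my %old = (a => 1);

            my %copy = %old;       # flatten and rebuild: an independent hash
            $copy{a} = 2;
            print $old{a}, "\n";   # 1 -- the original is untouched

            my $ref = \%old;       # a reference: the same hash under another name
            $ref->{a} = 3;
            print $old{a}, "\n";   # 3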