in reply to Re^6: Massive Perl Memory Leak
in thread Massive Perl Memory Leak

I don't see anything dramatically wrong with your threading code. There are some anomalies, like why you take a reference $session to the session object $session0, and then indirect through that instead of using the latter directly--but I'll put that down to artifacts of trimming your real code down for test purposes.

I'm also personally uncomfortable with code that doesn't use strict and relies on package globals for communication between subs, but it's your code. There are a few errors that would show up were you using strict and warnings. For example, you have

my $result6;
$result6 = $$session->get_entries(
    -columns => [ $ipCidrRouteIfIndex, $ipCidrRouteProto, $ipCidrRouteType ]
);
$result6 = $$session->get_entries(
    -columns => [ $ipRouteNextHop, $ipRouteIfIndex, $ipRouteMask,
                  $ipRouteProto,   $ipRouteType ]
) unless %$result6;

If the first call fails and returns undef, then %$result6 would produce the warning "Use of uninitialized value in hash dereference ..." if warnings were enabled. Better to use just unless $result6, or, as you have on the next line, unless defined $result6; but it probably doesn't affect the result.
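For example, this sketch (same variables as your snippet) guards the second call without ever dereferencing a possibly-undef value:

my $result6 = $$session->get_entries(
    -columns => [ $ipCidrRouteIfIndex, $ipCidrRouteProto, $ipCidrRouteType ]
);

## Fall back to the legacy route table only when the first call failed;
## 'defined' is safe even when get_entries() returned undef.
$result6 = $$session->get_entries(
    -columns => [ $ipRouteNextHop, $ipRouteIfIndex, $ipRouteMask,
                  $ipRouteProto,   $ipRouteType ]
) unless defined $result6;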

More important is what goes on under the covers of Net::SNMP--though take anything I say with a handful of salt as I have no devices to run it against and it's a huge amount of complicated code to run in your head.

The main comment I have is that it was never intended for use with threads. A large part of what makes it so complex is that it has its own built-in select-based, event-driven, multiprocessing model.

Ostensibly, you only need to add -nonblocking => 1 to the constructor and -callback => \&somesub to the get_entries() calls, and you can overlap all your requests without using threads.
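From the docs, the minimal shape is something like this (untested; the hostname, community string, and OID are placeholders):

use Net::SNMP;

## Invoked from within snmp_dispatcher() when the reply (or error) arrives
sub somesub {
    my( $session ) = @_;
    my $result = $session->var_bind_list();   ## undef if the request failed
    ## ... examine $result here ...
}

my( $session, $error ) = Net::SNMP->session(
    -hostname    => 'some.device',            ## placeholder
    -community   => 'public',                 ## placeholder
    -nonblocking => 1,                        ## queue the request; don't wait
);

$session->get_entries(
    -columns  => [ '1.3.6.1.2.1.4.21.1.7' ],  ## placeholder OID
    -callback => \&somesub,
);

## ... queue requests against as many sessions as you like ...

Net::SNMP->snmp_dispatcher();   ## all queued requests proceed in parallel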

Of course, your program being 100,000 lines indicates that you are doing some pretty complex and probably time-consuming processing of the data you are getting back. If you just stuck all that into the callback, the timeliness of the event-driven model would go right out the window and the performance would suck. And trying to break your code up into small chunks to maintain the responsiveness of the event loop, without being able to register your own code with their internal event queues, means that you would effectively have to build a second event-loop model on top, within your code.

The problem of the memory growth arises because they are using several 'singleton' instances internally.

And there are no fewer than 12 packages in the suite, and many of these load other--often large--packages: Math::BigInt, Digest::MD5, etc. All of these, and all of the global data represented by the singleton objects and global tables above, are being loaded and duplicated into every one of your 100 threads.

In essence, you are building a program that contains 100 event-loops, and then running each as a 1-shot event.

I also suspect, but cannot confirm from a dry-run analysis, that the memory growth occurs because global data structures get built and then replicated across 100 threads, but the clean-up of the memory associated with any single request will only happen in the thread that processes it. I.e. each request causes 100 copies of the data structures required to service it, but only one of those copies gets cleaned up upon completion, leaving 99 dead copies of everything from your 1000 requests kicking around doing nothing but taking up space.

It would certainly explain the growth.

The upshot.

You either need to forget using threads and use the built-in non-blocking multiprocessing model, with all the inherent difficulties of trying to break your extensive post-processing into bite-size chunks to maintain the responsiveness of the internal event loop. That probably means building a second event-loop mechanism in your own code.

Or entreat the authors to move away from using singletons, global data structures, and the (redundant) event-loop comms model when running in blocking mode.

Neither is a particularly enticing decision.

Update: There is another alternative. Run the Net::SNMP event loop in non-blocking mode in the base thread. Have the callbacks simply queue the returned data and return. Start a number of threads reading from the queue of response data to do the time-consuming processing. It would be important to ensure that your processing threads were started before you loaded Net::SNMP and all its dependencies in the main thread, to avoid all the code being replicated into the worker threads. I.e. require, not use.

This represents a fairly major restructuring of your existing code, but far less than either of the alternatives above.
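In outline, the load ordering that matters looks something like this (a sketch; the worker sub and thread count are placeholders):

use strict;
use warnings;
use threads;
use Thread::Queue;

my $Q = new Thread::Queue;

## Worker: dequeue results and do the heavy processing.
sub process_responses {
    while( defined( my $item = $Q->dequeue ) ) {
        ## ... time-consuming post-processing here ...
    }
}

## Spawn the workers while the interpreter is still small...
my @workers = map { threads->create( \&process_responses ) } 1 .. 10;

## ...and only then pull in Net::SNMP, so its code and global data
## get loaded only in this (the dispatching) thread.
require Net::SNMP;

## From here on: create non-blocking sessions whose callbacks just
## $Q->enqueue(...) their results, run Net::SNMP->snmp_dispatcher(),
## then shut the workers down with one undef per worker.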



Re^8: Massive Perl Memory Leak
by wagnerc (Sexton) on Jun 18, 2007 at 21:24 UTC
    Fascinating. But what I don't get is why any of the singleton objects would survive (to do bad things) thread creation, since each thread is supposed to have its own compartmentalized copy of the interpreter. No thread is supposed to "know" anything about any of the data in the other threads or be able to do anything to another thread without an explicit share(). The singletons should as well be cloned and not be able to touch each other. That's the definition of the threading model.

    I was saying that the script is 100 KB, not 100 K-lines. :P And you're right about the trim-down artifacts. $session is a ref to the object as created in another namespace. I moved that into main for the test script. %$result6 is another experiment. The original code is || separated.

      The singletons should as well be cloned and not be able to touch each other.

      I said I suspect but cannot prove. Without 1000 SNMP devices (or even one I could call 1000 times) all I can do is speculate. I agree it shouldn't happen, but are there any occasions or circumstances where a variable becomes shared without explicit sharing?

      Well, take the example of any parameters you pass to your thread code via threads->create(): you don't need to share those. But that's a special case. How about your leaker3.pl above? You create your queue in the main thread and never share it:

      my $queue = new Thread::Queue;
      $queue->enqueue( @ips[ 0 .. $howmany ] );

      but there you are using it inside the thread subroutine:

      while( my $string = $queue->dequeue_nb ) {

      If that was shared without explicit instruction, what else was?
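      As a quick demonstration, this (which prints all five items from inside the thread) passes data through a queue that is never explicitly shared; Thread::Queue arranges the sharing internally:

      use strict;
      use warnings;
      use threads;
      use Thread::Queue;

      my $queue = new Thread::Queue;   ## never share()d by us
      $queue->enqueue( 1 .. 5 );

      my $thr = threads->create( sub {
          ## Yet the thread sees the queued items.
          while( defined( my $item = $queue->dequeue_nb ) ) {
              print "worker saw $item\n";
          }
      } );
      $thr->join;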

      Do me a favour and try this. All the non-blocking SNMP code is derived purely from the docs, which I may have misunderstood. It compiles clean but is otherwise untested, so it may require a few tweaks. It is basically the third option I mentioned above, which uses threads to rescue the calling code from the binds of the underlying event-driven state machine.

      #! perl -slw
      use strict;

      use threads;
      use Thread::Queue;
      use Storable qw[ freeze thaw ];

      use constant {
          ipCidrRouteProto   => "1.3.6.1.2.1.4.24.4.1.7",
          ipCidrRouteIfIndex => "1.3.6.1.2.1.4.24.4.1.5",
          ipCidrRouteType    => "1.3.6.1.2.1.4.24.4.1.6",
          ipRouteMask        => "1.3.6.1.2.1.4.21.1.11",
          ipRouteIfIndex     => "1.3.6.1.2.1.4.21.1.2",
          ipRouteType        => "1.3.6.1.2.1.4.21.1.8",
          ipRouteNextHop     => "1.3.6.1.2.1.4.21.1.7",
          ipRouteProto       => "1.3.6.1.2.1.4.21.1.9",
      };

      sub warnf{ my $fmt = shift; warn sprintf $fmt, @_ }

      my $THREADS = $ARGV[ 0 ] || 100;
      my $Q = new Thread::Queue;

      ## The main processing goes here
      sub processResults {
          my( $host, $commstr, $result ) = @_;
          ## do your heavy stuff here
      }

      ## simple worker thread
      sub worker {
          while( my $frozen = $Q->dequeue ) {
              my $result  = thaw $frozen;
              my $host    = delete $result->{ _my_private_host_key };
              my $commstr = delete $result->{ _my_private_commstr_key };
              processResults( $host, $commstr, $result );
          }
      }

      ## Callback
      sub enqueue {
          ## First arg is the session handle--the rest whatever we asked for
          my( $session, $host, $commstr ) = @_;
          my $result = $session->var_bind_list();        ## Get the results hash
          $result->{ _my_private_host_key }    = $host;  ## Tack on some extras
          $result->{ _my_private_commstr_key } = $commstr;
          $Q->enqueue( freeze $result );                 ## And serialise it for queuing
          $session->close;
      }

      my @workers = map{ threads->create( \&worker ) } 1 .. $THREADS;

      ## Avoid loading the heavy stuff until after we've spawned our threads
      require Net::SNMP;

      open my $ipsFH, '<', 'allips.txt' or die $!;

      while( my( $host, $commstr ) = split ' ', <$ipsFH> ) {

          my( $session, $error ) = Net::SNMP->session(
              -hostname    => $host,
              -community   => $commstr,
              -port        => 161,
              -version     => 'SNMPv1',
              -translate   => 1,
              -debug       => 0x00,
              -nonblocking => 1,
              -timeout     => 30,
              -retries     => 3,
          );

          unless( defined $session ) {
              warn "Couldn't create session for $host/$commstr: reason: $error\n";
              next;
          }

          my $success = $session->get_entries(
              -columns  => [ ipCidrRouteIfIndex, ipCidrRouteProto, ipCidrRouteType ],
              ## Have host and commstr passed back to the callback
              -callback => [ \&enqueue, $host, $commstr ],
          ) and next;

          $success = $session->get_entries(
              -columns  => [ ipRouteNextHop, ipRouteIfIndex, ipRouteMask,
                             ipRouteProto,   ipRouteType ],
              ## Have host and commstr passed back to the callback
              -callback => [ \&enqueue, $host, $commstr ],
          ) or warnf "Failed to get_entries for $host/$commstr: reason %s\n",
                     $session->error();
      }
      close $ipsFH;

      ## Run the event loop
      Net::SNMP->snmp_dispatcher();

      ## Wait for the work to be done
      sleep 1 while $Q->pending;

      ## signal workers to die
      $Q->enqueue( ( undef ) x $THREADS );

      ## And bury them
      $_->join for @workers;

        I started your script. I'm not sure what it'll show but it's running and I'm monitoring it. :) I added a print lock and made processResults() just print a status message. There is one problem with your code: the second SNMP call will never be made. Non-blocking calls always return true. That would be a big problem in production code since we don't know which one we need ahead of time. I would have to spin the dispatcher, see if the OIDs were defined, and then possibly spin it again. The other major problem with the spin-the-dispatcher method is that I don't know what IPs to poll ahead of time. That's one of the reasons I did it as standalone threads with blocking SNMP. I wouldn't be able to queue up tons of SNMP calls and then spin the dispatcher and wait for the data. I'd have to queue up small batches of calls and spin the thing over and over again. So it doesn't really buy me anything. :)
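        The only way I can see around the first problem would be to make the decision inside the callback and issue the fallback request from there, so the dispatcher keeps running and only one pass is needed. An untested sketch (hand_off() and fallback_callback() are hypothetical stand-ins for the freeze-and-enqueue step above):

        ## Callback for the first (CIDR table) request
        sub cidr_callback {
            my( $session, $host, $commstr ) = @_;
            my $result = $session->var_bind_list();     ## undef/empty on failure

            if( $result and %$result ) {
                hand_off( $session, $host, $commstr );  ## hypothetical: as above
            }
            else {
                ## CIDR table absent: issue the legacy route-table request
                ## from inside the dispatcher; it runs on a later cycle.
                $session->get_entries(
                    -columns  => [ ipRouteNextHop, ipRouteIfIndex, ipRouteMask,
                                   ipRouteProto,   ipRouteType ],
                    -callback => [ \&fallback_callback, $host, $commstr ],
                );
            }
        }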

        Ok, the script completed. It only used about 300 MB of memory. As I expected, the non-CIDR devices returned nothing. :) I'd be interested to see how the non-blocking calls behave within worker threads... Hmm.

        Another thing I tried. In my leaker3 I moved the require Net::SNMP down into the entry point as eval "require Net::SNMP";, figuring that way there would be *no* objects to clone since the entire module wasn't loaded yet. Doing that used up even more memory than require'ing it at the top of the script. :P
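        For reference, the pattern was along these lines (a sketch; entrypoint() stands in for leaker3's thread sub):

        sub entrypoint {
            ## Defer loading until the thread actually runs, so nothing
            ## of Net::SNMP exists at thread-creation time to be cloned.
            eval "require Net::SNMP" or die $@;
            ## ... blocking Net::SNMP calls as before ...
        }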