I have been working on some code to handle to kill a run-away process. One of the tricks to this solution is that the run-away process likes to fill up the process table. This means my assassin cannot rely on forking, system calls, etc.

To get around this, I downloaded and installed the Solaris::Procfs module and figured out how to use it. Everything looked fine and I started the daemon in a test mode.

Imagine my surprise when I checked the next morning and the process had a size of 53Mb and an RSS of 53Mb. Obviously, I have a bad memory leak.

I have looked over my code and have cleaned the obvious problems out. I fear the leak may actually be in the Solaris::Procfs module, which uses a fair amount of XS.

My questions are:

I have thought about running the daemon and having it spawn children to do its job. If a child cannot be spawned, the daemon will handle the job itself. This will not fix the leak, but should slow it down.

The code should ( hopefully ) be beneath the readmore tag.
TIA,
mikfire

#!/os/dist/bin/perl -w use Solaris::Procfs qw/:procfiles/; # Get a ps -auxw w/o shelling use Net::SNPP; # Messages to pagers use Sys::Syslog; # All other messages use Data::Dumper; my $killing; my $DEBUG = 1; #--- # There is a very nasty race condition I am trying to avoid. It coul +d be # possible that, between the call to get_procfs and the kill more pro +cesses # could be spawned. We need to make sure we kill them all. Thus, th +e # killing flag is set and we will kill until there are no more to be +killed. # # The loop has three basic portions: # 1. If the killing flag isn't set and the number of running ypserv +'s # is between 1 and 8, we sleep and loop. # 2. If there are no ypserv's running, we fork a child and exec a # ypstart in the child. When the child is reaped, we return to +the # top of the loop # 3. Otherwise, we kill everything named ypserv. I sort on the PPI +D, # hoping to kill the one spawning everybody else first. #--- while ( 1 ) { my %procs = (); get_procs(\%procs); my $procnum = keys %procs; #--- # Nothing wrong, go back to sleep #--- unless ( $killing || $procnum > 8 ) { LogIt( 'Nothing to kill', 'info' ); sleep 180; next; } #--- # If no ypservs are found, then we likely killed them all and it i +s time # to ( hopefully ) fork a new ypserv process. #--- unless ( $procnum ) { #--- # We will do the fork and exec by hand. That way we can flag erro +rs # and attempt to be smart. #--- my ($kid,$dead,$looped); if ( $DEBUG ) { LogIt( 'Would have forked ypstart', 'debug' ); sleep 300; next; } if ( $kid = fork ) { do { select(undef,undef,undef,0.5); $looped++; $dead = waitpid(-1,1); } until $dead == -1 || $looped > 20; if ( $looped > 20 ) { LogIt( "Could not reap ypstart", 'err' ); next; } $killing = 0; } elsif ( defined( $kid ) ) { exec "/usr/lib/netsvc/yp/ypstart"; } else { LogIt("Couldn't fork: $!",'err'); } } #--- # Start with the cement shoes and overcoats. I try to get the ( I + hope ) # ring leader first - which will have, I assume, a PPID of 1. #--- if ( $DEBUG ) { LogIt( "$procnum ypservs running, would have waxed $total", 'debug +' ); sleep 300; next; } $killing = 1; $total = kill 9, sort { $procs{$a}{PPID} <=> $procs{$b}{PPID} } ke +ys %procs; LogIt( "$procnum ypservs running, $total waxed", 'warning' ); } #--- # All the functions for Procfs seem to rely upon the pid. The only wa +y I can # think of getting this information is to grope /proc and get the info +rmation # for each directory I find. #--- sub get_procs { my $ref = shift; print localtime, "opening procfs\n"; opendir PIDS, "/proc" or die "Coudln't open /proc: $!"; while ( $pid = readdir PIDS ) { next if ( $pid =~ /^\.\.?\z/ ); my $info = psinfo $pid; unless ( defined( $info ) && ref $info eq 'HASH' ) { warn "Couldn't grope pid $pid: $!\n"; next; } #--- # Do nothing unless this is a ypserv #--- next unless ( $info->{pr_fname} eq 'ypserv' ); $ref->{PID} = $pid; $ref->{PPID} = $info->{pp_ppid}; $info = undef; } closedir PIDS; } #--- # Stupid logging function - sends a page for anything more important t +han info # and syslogs them all #--- sub LogIt { my ($message, $level ) = @_; my $paged = 0; $level ||= 'daemon'; if ( $level eq 'debug' ) { my $snpp = Net::SNPP->new('snpphost'); $paged = $snpp->send( Pager => 'mik', Message => $message, ); return 1; } unless ( $level eq 'info' ) { my $snpp = Net::SNPP->new('snpphost'); $paged = $snpp->send( Pager => 'sysadm', Message => $message, ); } openlog('Sacco', 'pid', 'daemon'); syslog( $level, "$message\n" ); syslog( 'err', "Paging failed\n" ) unless ( $level eq 'info' || $p +aged ); closelog(); }

In reply to Memory leaks by mikfire

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.