in reply to How to find a memory leak - appears to be "system" calls that are responsible

Well, if an external program fails due to lack of memory, that doesn't mean the external program is at fault. If there's only a little memory left, the external program will fail, but someone else is to blame.

You don't show code, you don't tell us which commands (or their arguments) are being called, and you don't tell us how many are called. So, it's just guessing. But when you said "files of 200-800M" and memory problems in the same sentence, I immediately started to wonder whether you suck in the entire file. That's going to take a lot of memory, which might cause other programs to fail.
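To illustrate the difference (a rough sketch; $file and process() are placeholders, not from your code):

    # Slurping: the whole 200-800M file sits in RAM at once.
    open my $fh, '<', $file or die "$file: $!";
    my $everything = do { local $/; <$fh> };

    # Streaming: only one line is in RAM at a time.
    open my $fh2, '<', $file or die "$file: $!";
    while (my $line = <$fh2>) {
        process($line);    # hypothetical per-line handler
    }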

Abigail

Re: Re: How to find a memory leak - appears to be "system" calls that are responsible
by naum (Initiate) on May 15, 2004 at 04:53 UTC
    Basically, the file is split at header records that start with '##'; each piece is "cut" out and placed into an output directory.

    Initial testing as I built this script up used a simple ksh script to read through the directory and submit another perl script to format, mail & compress. That process handled the stress test very well.

    Splitter code

    sub endEmailPackage {
        my ($SPLITOUT, $splitoutfilename) = @_;
        print $SPLITOUT endLine();
        close $SPLITOUT;
        my $subrc = system("MailPush $splitoutfilename");
        if ($subrc == 0) {
            logit("MailPush $splitoutfilename submitted successfully");
        }
        else {
            logit("Bad return code on submission of MailPush $splitoutfilename, return code is $?");
        }
        sleep 2;
    }

    sub endLine {
        return '##END' . (' ' x 75) . "\n";
    }

    sub scrubHeaderParm {
        my ($href) = @_;
        foreach my $k (keys %{$href}) {
            $href->{$k} =~ s/^\s+//;
            $href->{$k} =~ s/\s+$//;
        }
    }

    sub splitupFile {
        my ($INFILE) = @_;
        seek $INFILE, 0, 0;
        my $SPLITOUT;
        my $splitoutfilename;
        while (<$INFILE>) {
            if (/^##A/) {
                my %hopt = /$headerregex/;
                logit($_);
                scrubHeaderParm(\%hopt);
                foreach my $k (keys %hopt) {
                    logit("$k: $hopt{$k}");
                }
                endEmailPackage($SPLITOUT, $splitoutfilename) if $addrectot > 0;
                $addrectot++;
                $splitoutfilename = "$prepdir/$hopt{ID}.$hopt{BATCHID}.$$";
                open($SPLITOUT, "> $splitoutfilename") or alert("$!");
                logit("Writing splitup output to $splitoutfilename");
            }
            print $SPLITOUT $_ unless /^##END/;
        }
        endEmailPackage($SPLITOUT, $splitoutfilename) if $addrectot > 0;
    }
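    One thing worth trying in endEmailPackage() (just a sketch, not tested here): the list form of system skips the intermediate /bin/sh, so each MailPush costs one fork instead of two, and a -1 return pinpoints a failed fork:

        # List form: no shell is forked, and -1 means the fork/exec failed
        my $subrc = system('MailPush', $splitoutfilename);
        logit("Could not fork/exec MailPush: $!") if $subrc == -1;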

    Mailer code loop

    open(INFILE, "< $infile") or alert("$!");
    logit("Opening $infile for reading");
    my $datequal = strftime('%m%d%C%y%H%M%S', localtime());
    my $ofilename = "$hopt{TPID}.$hopt{BATCHID}.$datequal.txt";
    my $ofilepath = "$outdir/$ofilename";
    open(my $AFILE, "> $ofilepath") or alert("$!");
    logit("Opening $ofilepath for writing");
    while (<INFILE>) {
        writeAFileOut($AFILE, $_);
    }
    close $AFILE;
    compressAFile();
    if (deliverAPackage()) {
        sleep 2;
        my $rc;
        $rc = system("mv $infile $arcdir");
        logit("Return code of $rc after move of $infile to $arcdir");
        my $bfile = basename $infile;
        $rc = system("/usr/contrib/bin/gzip $arcdir/$bfile");
        logit("Return code of $rc after gzip of $arcdir/$bfile");
        unlink($ofilepath);
    }

    sub compressAFile {
        logit("Compressing $ofilepath");
        my $gziprc = system("/usr/contrib/bin/gzip -f -n $ofilepath");
        logit("Return code $gziprc after gzip of $ofilepath");
        alert("Unable to compress $ofilepath") if ($gziprc);
        $ofilename = $ofilename . ".gz";
        $ofilepath = "$outdir/$ofilename";
    }

    sub deliverAPackage {
        my $templatefile = "$templatedir/$hopt{EDITYPE}";
        alert("Failed to load template $templatefile") unless (-e $templatefile);
        my $body = `cat $templatefile`;
        $body .= "\n\n";
        $body .= "Effective Date: $hopt{DATE} \n" if ($hopt{DATE} =~ /\S+/);
        $body .= "Admin: $hopt{ADMIN}\n";
        $body .= "Email: $hopt{ADMEML}\n\n";
        $body .= `cat $defaulttemplate`;
        $subject = "$subject - $hopt{TPID}";
        my $mailrc = sendEmail($hopt{EMAIL}, $subject, $body, $ofilepath, $hopt{FILENAME}, $hopt{EXT});
        return $mailrc;
    }

    sub scrubHeaderOpt {
        my ($href) = @_;
        foreach (keys %{$href}) {
            $href->{$_} =~ s/^\s+//;
            $href->{$_} =~ s/\s+$//;
        }
        $href->{EDITYPE} = substr($href->{ID}, 0, 3);
        $href->{EDITYPE} .= $href->{TYP} if $href->{TYP};
        $href->{TPID} = substr($href->{ID}, 3);
    }

    sub writeAFileOut {
        my ($OFILE, $data) = @_;
        return if ($data =~ /^##ADD/ && $removeaddsw eq 'Y');
        return if ($data =~ /^##END/);
        $data =~ s/\n/\r\n/g;
        print $OFILE $data;
    }
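    Incidentally, the system("mv ...") call could be done in-process (a sketch; untested against the rest of the script): File::Copy's move() needs no child process at all:

        # In-process move: no fork, no shell
        use File::Copy qw(move);
        use File::Basename qw(basename);
        move($infile, "$arcdir/" . basename($infile))
            or alert("Failed to move $infile to $arcdir: $!");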

    Bottom line: if the system call to MailPush is omitted in endEmailPackage() and a ksh script instead simply loops through all the files in the directory, it runs like a charm. With the system call in place, memory is slurped so hard that the "top" command fails with "not enough memory"...

    Also, the existing process had both functions together, and for input files over 10M the system("gzip...") and system("mv...") calls would at some point fail with a -1 return code: again, a memory leak. The problem was alleviated somewhat when I replaced system("rm...") with unlink, but it still pops up intermittently, especially on >100M input files.
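    For what it's worth, a -1 from system() means the fork or exec itself failed (commonly ENOMEM when the parent process has grown huge), and $! says why; $? decodes the other cases. A diagnostic sketch along these lines might narrow it down:

        my $rc = system("/usr/contrib/bin/gzip $arcdir/$bfile");
        if ($rc == -1) {
            # The child never ran at all; $! holds the fork/exec error
            logit("gzip never ran, fork/exec failed: $!");
        }
        elsif ($? & 127) {
            logit("gzip died on signal " . ($? & 127));
        }
        else {
            logit("gzip exited with status " . ($? >> 8));
        }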

      ... if the system call to MailPush is omitted ... it runs like a charm.

      Sounds like a problem with MailPush, then, not with gzip or mv. My guess is that MailPush is trying to be helpful by daemonizing itself to send the mail and returning immediately. Then it up and loads the whole file into memory before mailing it.

      Try replacing MailPush with cat $splitoutfilename > /dev/null or something similar. If that works, try replacing it with Mail::Mailer or something similar. If that works, you're done! If it doesn't, use top to investigate while parsing a smaller file, one that doesn't completely hose the system.
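      For the Mail::Mailer route, something along these lines might do (a sketch; the header values are placeholders, and it streams the message body line by line instead of slurping it):

          use Mail::Mailer;
          my $mailer = Mail::Mailer->new('sendmail');
          $mailer->open({ From    => $from,
                          To      => $to,
                          Subject => $subject });
          # Stream the file through, one line at a time
          open my $fh, '<', $splitoutfilename or die "$splitoutfilename: $!";
          print $mailer $_ while <$fh>;
          $mailer->close or die "Couldn't send: $!";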

        >> Sounds like a problem with MailPush, then, not with gzip or mv. My guess is that MailPush is trying to be helpful by daemonizing itself to send the mail and returning immediately. Then it up and loads the whole file into memory before mailing it.

        No, because I can string all those calls to MailPush together in a ksh for loop and memory usage never goes above 8M, whereas the Splitter just leaks memory out the wazoo when using "system" to invoke the process...
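        To see where the growth actually happens, something like this rough sketch could log the splitter's own size right before each system() call (assumes a ps that accepts -o vsz=):

            # Log the parent's virtual size; growth across iterations
            # would show up in the log next to each MailPush submission
            sub logSelfSize {
                my ($vsz) = `ps -o vsz= -p $$` =~ /(\d+)/;
                logit("parent VSZ before system(): ${vsz}K");
            }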