downer has asked for the wisdom of the Perl Monks concerning the following question:

I am running Perl code that computes some statistics based on lines of text. Each line of text is rather small, and as I watch the program progress with top (Fedora) it never uses more than 4% of system memory. Without fail, after maybe 5 minutes, the program suddenly ends with an Out of memory! error. I have completely modified the input data in hopes of doing most of the processing beforehand, and have rewritten the code several times in hopes of making it lighter. The code is now totally ugly and the error persists.
#!/usr/bin/perl
while(<>) {
    chomp;
    @pages = split(/\+\+/, $_);
    @SonarData = ();
    %DOWN = ();
    @inlevels = ();
    @outlevels = ();
    @pagelevels = ();
    %OUT = ();
    %URL = ();
    %NUMOUT = ();
    %LEVEL = ();
    $numpages = @pages;
    $intop = 0;
    $intotal = 0;
    $outtotal = 0;
    $inavg = 0;
    $outavg = 0;
    foreach $x (@pages) {
        ($id, $url, $level, $numout, $numin, $out, $in) = split(/\*\*/,$x);
        $pagelevels[$level]++;
        $inlevels[$level] += $numin;
        $outlevels[$level] += $numout;
        $OUT{$id} = $out;
        $URL{$id} = $url;
        $DOWN{$id} = 0;
        $NUMOUT{$id} = $numout;
        $LEVEL{$id} = $level;
        $intotal += $numin;
        $outtotal += $numout;
        $inavg += $numin*$level;
        $outavg += $numout*$level;
    }
    @pages = ();
    if($intotal) { $SonarData[6] = $inlevels[1]/$intotal; }
    else { $SonarData[6] = 0; }
    $total = 0;
    $numlevels = @pagelevels;
    $max = 0;
    $x = 0;
    while($x < $numlevels) {
        $total += $pagelevels[$x]*$x;
        if($pagelevels[$x]>$max) { $max = $pagelevels[$x]; }
        $x++;
    }
    $SonarData[0] = $total/$numpages;
    $SonarData[1] = $max/$numpages;
    if($pagelevels[1]) { $SonarData[2] = $pagelevels[2]/$pagelevels[1]; }
    else { $SonarData[2] = 0; }
    $SonarData[3] = $intotal/$numpages;
    $SonarData[4] = $outtotal/$numpages;
    if($intotal) {
        $SonarData[5] = $outtotal/$intotal;
        $SonarData[8] = $inavg/$intotal;
    }
    else {
        $SonarData[5] = 0;
        $SonarData[8] = 0;
    }
    if($outtotal) { $SonarData[9] = $outavg/$outtotal; }
    else { $SonarData[9] = 0; }
    $SonarData[10] = $SonarData[9] - $SonarData[8];
    if($intotal) {
        $max = 0;
        foreach $x (@inlevels) {
            if($x > $max) { $max = $x; }
        }
        $SonarData[11] = $max/$intotal;
    }
    else { $SonarData[11] = 0; }
    if($outtotal) {
        $max = 0;
        foreach $x (@outlevels) {
            if($x > $max) { $max = $x; }
        }
        $SonarData[12] = $max/$outtotal;
    }
    else { $SonarData[12] = 0; }
    @uplevel = ();
    @downlevel = ();
    @crosslevel = ();
    @sidelevel = ();
    $totalcross = 0;
    foreach $start (keys %OUT) {
        $startLevel = $LEVEL{$start};
        $startURL = $URL{$start};
        @links = split(/ /, $OUT{$start});
        foreach $end (@links) {
            if($start != $end) {
                $endURL = $URL{$end};
                $endLevel = $LEVEL{$end};
                if($startLevel == $endLevel) {
                    $sidelevel[$startLevel]++;
                }
                elsif($startLevel < $endLevel && $startURL =~ m{\Q^$endURL}) {
                    print STDERR "downlink\n";
                    $downlevel[$endLevel]++;
                    $DOWN{$start} = 1;
                }
                elsif($startLevel > $endLevel && $endURL =~ m{\Q^$startURL}) {
                    print STDERR "uplink\n";
                    $uplevel[$endLevel]++;
                }
                else {
                    # print STDERR "crosslink\n";
                    $crosslevel[$endlevel++];
                    $totalcross++;
                }
            }
        }
    }
    @tmp = keys (%DOWN);
    $total = 0;
    $count = 0;
    foreach $x (@tmp) {
        if(!$DOWN{$x}) { #a leaf page
            $count++;
            $total += $NUMOUT{$x};
        }
    }
    if($count) { $SonarData[7] = $total/$count; }
    else { $SonarData[7] = 0; }
    $SonarData[13] = $totalcross/$numpages;
    $N = 0;
    $S = 0;
    for($x = 2; $x <= 4; $x++) {
        $N += ($uplevel[$x] + $sidelevel[$x] + $downlevel[$x]);
        $S += $crosslevel[$x];
    }
    if($S) { $SonarData[14] = $N/$S; }
    else { $SonarData[14] = 0; }
    $SonarData[15] = ($uplevel[1]+$sidelevel[1]+$downlevel[1]+$crosslevel[1])/$numpages;
    $ print "@SonarData\n";
}
Sorry, I know the code is hideous. I've rewritten it from scratch at least 4 times, and it's really wearing on my nerves. Are there any common problems that lead to this kind of error?

Replies are listed 'Best First'.
Re: out of memory! (again!)
by andreas1234567 (Vicar) on Sep 04, 2007 at 18:32 UTC
    Using strict and warnings would certainly make your life (as a programmer) a little bit easier:
    use strict;
    use warnings;
    while(<DATA>) {
        chomp;
        my @pages = split(/\+\+/, $_);
        my @SonarData = ();
        my %DOWN = ();

    C:\src\perl\perlmonks\636961>perl -wc 636961.pl
    "my" variable @pages masks earlier declaration in same scope at 636961.pl line 17.
    String found where operator expected at 636961.pl line 220, near "$ print "@SonarData\n""
    Update: Your script is littered with small but subtle errors that are easily detected using strict and warnings:
    • A $ before print, probably just a typo.
    • Redefined variables, "my" variable $total masks earlier declaration in same scope.
    • Case errors: Global symbol "$endlevel" requires explicit package name.
    • Wrong use of increment as pointed out by rhesa, would be reported by warnings: Useless use of array element in void context when $endlevel is corrected to $endLevel.
    Once you get your script to do what you want, take a step back and consider refactoring (perl.com article).
    --
    Andreas
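To make the case-error point concrete, here is a minimal sketch of how strict turns a miscased variable name into a compile-time error. The snippet inside the heredoc is hypothetical (it is not the original script); it declares $endLevel and then touches $endlevel. It is compiled with a string eval only so that the error can be captured and printed instead of aborting the example.

```perl
use strict;
use warnings;

# Hypothetical snippet: declares $endLevel, then uses $endlevel (lowercase "l").
my $snippet = <<'END';
use strict;
my $endLevel = 3;
$endlevel++;
1;
END

# A string eval compiles the snippet at run time, so the strict failure
# ends up in $@ rather than stopping this script.
my $ok    = eval $snippet;
my $error = $ok ? '' : $@;

print $ok ? "compiled cleanly\n" : "strict caught: $error";
```

Without use strict, the same snippet compiles silently and $endlevel becomes a separate global variable, which is how this kind of typo can hide in a long script.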
Re: out of memory! (again!)
by salva (Canon) on Sep 04, 2007 at 18:17 UTC
    check the array indexes you are using:

    Something similar to:

    $a[2**29] = 1;
    causes an out of memory error (on most 32-bit systems).
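The mechanism behind that hint can be sketched on a smaller scale: assigning to an index far past the end of an array makes Perl extend the array up to that index, while merely reading such an index does not. The array names and the index 1_000_000 below are illustrative choices, not taken from the original script. Whether any input field in the posted code (for example $level) ever gets large enough to matter is an assumption the thread does not confirm.

```perl
use strict;
use warnings;

my @levels;
print "before: ", scalar(@levels), " elements\n";

# A single assignment far past the end extends the array to that index.
$levels[1_000_000] = 1;
my $size_after_assign = scalar @levels;
print "after assignment: $size_after_assign elements\n";

# Reading a far-away index only returns undef; the array is not extended.
my @peek;
my $value = $peek[1_000_000];
my $size_after_read = scalar @peek;
print "after read: $size_after_read elements\n";
```

Operations such as ++ or += on an element count as assignments here, so an index taken from untrusted input is worth range-checking before it is used.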
      Based on that hint, line 169:
      $crosslevel[$endlevel++];
      should probably read:
      $crosslevel[$endlevel]++;
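The difference between the two forms can be checked in isolation. In this sketch, $end_level and the starting value 2 are illustrative assumptions. The first form increments the index variable and only reads the element, so nothing is counted. The second form increments the element itself. The void warning is disabled locally so the first form runs quietly. With warnings fully enabled it reports "Useless use of array element in void context".

```perl
use strict;
use warnings;

my @crosslevel;
my $end_level = 2;

{
    no warnings 'void';
    # Posted form: the index variable is incremented, the element is only read.
    $crosslevel[$end_level++];
}
my $index_after_posted = $end_level;          # index has moved on
my $count_after_posted = scalar @crosslevel;  # array is still empty

$end_level = 2;
# Suggested form: the element at the current index is incremented.
$crosslevel[$end_level]++;

print "index after posted form: $index_after_posted\n";
print "elements after posted form: $count_after_posted\n";
print "counter after suggested form: $crosslevel[2]\n";
```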