I have a reasonably large program that queries more than 200,000 records from a database, manipulates the results, and writes the output to a flat file (over 50 Meg in size).

Everything was fine until I was asked to place this program onto a less powerful machine as a 'backup'. When I ran the program, Solaris complained that I had run out of memory.

So I read, and I experimented, and read some more, and so forth. I reduced the memory consumption enough that the program still runs, and added more swap space, but it still wastes a lot of time swapping memory in and out. (On a sufficiently robust system it takes less than 20 minutes to run; on this workstation it takes more than 2 1/2 hours.)

In the process of learning, I found a system memory hog that I don't quite know how to deal with.

Here is a little program that illustrates my problem. The my $str declaration is inside the for loop to mimic my original program; I know that in this context it looks silly. During execution I monitor the system resources using vmstat 1 in another window. I used sleep to slow the process down so that I could watch the memory usage more easily.
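For anyone who wants to reproduce the monitoring side, here is a rough sketch of what I do (a small dd writer stands in for my Perl program, and the file name is made up; the vmstat call is skipped if the tool isn't installed):

```shell
# Start a ~15 MB writer in the background (stand-in for the Perl program),
# then sample memory statistics while it runs.
dd if=/dev/zero of=bigfile.tmp bs=1024 count=15000 2>/dev/null &
command -v vmstat >/dev/null && vmstat 1 5   # one sample per second, five samples
wait                                         # let the writer finish
ls -l bigfile.tmp
```

On Solaris the interesting column is 'free' (in KB), which is what the graph below tracks.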

#!/usr/bin/perl
use strict;

# Loop # 1: write each of three files once
my @outfiles = qw(d.d e.e f.f);
foreach my $file (@outfiles) {
    print "\$file = $file\n";
    open OUTFILE, ">", $file or die "Ooops\n";
    for my $i (1 .. 50000) {
        if (int($i/1000)*1000 == $i) { sleep(1); }   # pause every 1000 lines
        my $str = " " x 300 . "\n";
        print OUTFILE $str;
    }
    close OUTFILE;
}

# Loop # 2: append to the same file three times
@outfiles = qw(d.d d.d d.d);
open OUTFILE, ">", "d.d" or die "Oops\n";   # truncate d.d first
close OUTFILE;

foreach my $file (@outfiles) {
    print "\$file = $file\n";
    open OUTFILE, ">>", $file or die "Ooops\n";
    for my $i (1 .. 50000) {
        if (int($i/1000)*1000 == $i) { sleep(1); }   # pause every 1000 lines
        my $str = " " x 300 . "\n";
        print OUTFILE $str;
    }
    close OUTFILE;
}
In the 1st major loop, the available memory goes down by 15Meg for each file, but recovers this memory when the file is closed. In the 2nd major loop, the available memory goes down a total of 45Meg, without recovering the memory when the file is closed and re-opened in 'append' mode.

The graph of the free memory (in KB, from vmstat) roughly shows it hovering between 410000 and 420000 through the first loop, dipping and recovering as each file is closed, then stepping steadily down to about 370000 through the second loop without ever recovering.
My original program requires that the file be written (15 other programs read that file, and I do not have the luxury of rewriting them), so I cannot switch to a database to store this data.

I could write the file in separate 'chunks' and then cat them all together at the end, but this seems kinda clunky and inelegant.
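For the record, the chunk-and-cat idea I mean is something like this (the printf lines stand in for the real record-writing Perl program, and all file names are hypothetical):

```shell
# Write the output in pieces, then concatenate and clean up.
set -e
for i in 1 2 3 4; do
    # Stand-in for the real Perl program; each chunk would hold
    # roughly a quarter of the 200,000 records.
    printf 'records for chunk %d\n' "$i" > "part.$i"
done
cat part.1 part.2 part.3 part.4 > final.out    # stitch the chunks together
rm -f part.1 part.2 part.3 part.4              # remove the temporaries
```

The idea is that no single open file ever grows large enough for the cache to eat all of memory, at the cost of an extra pass over the data at the end.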

Does anybody have any ideas?

I've tried flushing ($|++) and format/write. Neither helped.

Thanks
Sandy.

PS: I am running Solaris 2.8 (and rehosting onto Solaris 2.7). I am using perl 5.8.2

UPDATE:

I found some more information about the problem, thanks in part to browseruk, who gave me the key words to google for (file caching solaris).

To minimize I/O time, Solaris caches the entire open file in memory. I tested my program on the machine that had sufficient memory, so without swapping, the memory consumption now makes sense (the test files were 15 Meg, and the memory consumption was 15 Meg). Also, when a file is reopened for appending, the existing contents are loaded back into memory, which is why closing and reopening a file in append mode did nothing.

Solaris 2.8 behaves nicely when swapping. (My test was run on Solaris 2.8.) It does not swap out processes if it runs out of memory for I/O. Solaris 2.7, however, is not so nice: it does swap out processes if it needs more memory for I/O. This can cause the machine to spend more time swapping than working. Because my Solaris 2.7 workstation is low on memory, it can't hold the entire file in memory, so it starts swapping; hence my original 20-minute program took 2 1/2 hours on the workstation.

There is a solution called priority_paging, which supposedly fixes the problem (I haven't tried it yet).
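If I'm reading the Sun docs right, turning it on is a one-line addition to /etc/system followed by a reboot; a sketch, untested by me (the parameter name comes from the references below, and this only applies to releases before Solaris 8):

```
* /etc/system -- enable priority paging so the file-system page cache
* no longer pushes application pages out (Solaris 2.6 / 2.7)
set priority_paging=1
```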

Reference:

http://www.princeton.edu/~psg/unix/Solaris/troubleshoot/ram.html

http://sunsolve.sun.com/pub-cgi/show.pl?target=content/content8

http://www.sun.com/sun-on-net/performance/priority_paging.html

Sandy


In reply to Managing System Memory Resources by Sandy
