Workplane has asked for the wisdom of the Perl Monks concerning the following question:

All,

My problem is that when I run my programme, the free memory on the workstation decreases and does not increase again even after the run has completed. Is this normal?

Background: I have a perl programme running on Solaris: $ uname -a

SunOS soc08433 5.10 Generic_127111-11 sun4u sparc SUNW,Sun-Blade-2500

The programme reads a hash which contains a number of filenames. It then checks those files for existence and looks for certain lines in them.

There are approximately 45000 first level keys in the hash and it is 3 levels deep.

There are approximately 800000 occasions where I look at a file, either to check existence or to look for a string in it. The files are all on NFS.

If I run the programme repeatedly, the amount of free memory lost on each run decreases, but after a while the workstation becomes unusable and I have to reboot it (and free goes back to 3.5G or something).

The workstation has 4G of physical memory, and a run of the programme can reduce that by as much as 1.2G.

I am determining free memory on the workstation with "vmstat 2 2".

I run vmstat before and after the programme and take the "free" amount from the second line.

If I run it continuously this looks to be correct. The figures from vmstat tie up with the numbers from "top".
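For reference, here is a minimal sketch in Perl of that measurement (an illustration only; the column position is an assumption based on typical Solaris vmstat output, where "free" is the fifth field of each data line):

    # Take the last data line of `vmstat 2 2` (the second sample) and pull
    # out the free-memory column, reported in KB.
    my @samples = grep { /^\s*\d/ } qx(vmstat 2 2);   # keep only the data lines
    my $free_kb = (split ' ', $samples[-1])[4];       # assumed: 5th field is "free"
    print "free: $free_kb KB\n";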

So what appears to be happening is the memory is being used by my programme and is not being made available again even after the programme has exited.

What does the programme do:

The programme opens and closes files (there were situations where I wasn't closing files, I think they are all fixed now).

It reads from the files and compares the contents to known standards.

It stores results in the main hash. It writes an output file, but only once so I don't think this is relevant.
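To make that description concrete, here is a minimal sketch of the kind of loop involved (the hash layout, the filename field and the marker pattern are invented for illustration; this is not the actual code):

    # For each entry: check the file exists, scan it for a marker line, and
    # record the outcome back in the hash. The lexical filehandle goes out of
    # scope at the end of each iteration, so the file is closed even if the
    # explicit close() were forgotten.
    for my $key (keys %results) {
        my $file = $results{$key}{path};
        unless (-e $file) {
            $results{$key}{status} = 'missing';
            next;
        }
        open my $fh, '<', $file or do {
            $results{$key}{status} = "open failed: $!";
            next;
        };
        while (my $line = <$fh>) {
            if ($line =~ /EXPECTED MARKER/) {
                $results{$key}{status} = 'ok';
                last;
            }
        }
        close $fh;
    }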

I've tried running only, say, the first 5000 first-level keys, or not running all the tests on each key, but because the problem is a bit elastic, it is hard to determine a pattern. And each run can take a couple of hours.

So the question is:

Is this a memory leak? (I don't think so, because it persists after the programme has exited.)

If it isn't a memory leak, what is it (so that I can search for answers)?

What could be causing this? (without seeing my lines of poor code)

Is there anything that I could call near the end to prevent this?

Are there any useful diagnostics that I can run to track this down?

perl -Version produces:

Summary of my perl5 (revision 5 version 8 subversion 4) configuration: and another 70 lines.

I can't change the version of perl or Solaris that I have!

I look forward to your help!

Cheers

Geoff.

Sorry about the formatting, I don't seem to have got the hang of that yet!


Replies are listed 'Best First'.
Re: memory not freed after perl exits. Solaris.
by eyepopslikeamosquito (Archbishop) on Nov 23, 2010 at 11:13 UTC

    A user-level program, whether written in Perl or any language, should not cause the operating system to leak memory after it exits. I'd speculate that something your program is doing is triggering a kernel bug, or, more likely, a kernel module bug.

    Sound OS primitives (e.g. file handles) should automatically be cleaned up by the OS after the program exits (even if it crashes). Admittedly, there may be a few unsound OS primitives (e.g. the horrible old System V semaphores) that don't get automatically cleaned up on program exit.

    Update: if you're able to run your program on a local disk (rather than NFS) that would be useful in that if it does not leak with local files only, that would suggest a fault with your NFS system software.

Re: memory not freed after perl exits. Solaris.
by jethro (Monsignor) on Nov 23, 2010 at 11:16 UTC

    If the script ends, the operating system frees the memory. So either your OS is corrupted or more likely your script never ends and stays as a defunct process. Try ps -aux to see all processes, top shows you only a part.

    Other interesting OS commands you might have on your solaris: truss to trace the OS calls of your script, /usr/proc/bin/pmem to list memory usage, /usr/proc/bin/pfiles to list open files

Re: memory not freed after perl exits. Solaris.
by salva (Canon) on Nov 23, 2010 at 12:27 UTC
    Your OS may be using the missing memory as file cache.

    Once the process exits, the OS frees all the memory allocated by it. AFAIK, the only way to have allocated memory survive the program is to use named shared memory.

    Perl will not do that by itself. You will have to request this kind of memory explicitly from your program or use some module that does it.
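    As an illustration of what requesting such memory explicitly looks like (a minimal sketch, not something the OP's program is known to do): a System V shared memory segment created like this survives the process that created it and is listed by 'ipcs -m' until it is removed.

        # Create a 1MB System V shared memory segment with mode 0600. It
        # persists after the process exits until removed with IPC_RMID
        # (or the machine reboots).
        use IPC::SysV qw(IPC_PRIVATE IPC_RMID);

        my $id = shmget(IPC_PRIVATE, 1024 * 1024, 0600);
        defined $id or die "shmget failed: $!";
        print "created segment $id; 'ipcs -m' will still list it after exit\n";
        # To remove it explicitly:  shmctl($id, IPC_RMID, 0);

    Leftover segments can also be removed from the shell with ipcrm.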

      Agreed. Asking the OS for the amount of free memory is mostly useless, since the OS might decide to use large portions for buffering. If you want free memory, don't put it into the computer.

      A better test is to start a program that needs a lot of memory. If it needs more than the amount of "free" memory, the OS gives up file caches, and the program still starts successfully even though it needs more memory than was reported as "free".

      Are you storing data on /tmp, and is it mounted as tmpfs?

      --MidLifeXis

Re: memory not freed after perl exits. Solaris.
by cdarke (Prior) on Nov 23, 2010 at 14:21 UTC
    How do you terminate your process? I have seen many people think that <CTRL>+Z terminates a process, whereas in fact it suspends the process and moves it into background (depending on terminal settings and the shell). Type jobs to see any background jobs that are running or suspended.
Re: memory not freed after perl exits. Solaris.
by Illuminatus (Curate) on Nov 23, 2010 at 15:57 UTC
    From your 'uname' output, you are running on Solaris 10. There is virtually no way that this is an OS problem. There are really only 3 possibilities:
    1. your program is not really ending, or becomes a zombie. This has already been discussed at length.
    2. Your program is interacting with another process on the system, and that process is sucking up the memory
    3. (highly unlikely) Either your program or another process is doing some sort of persistent memory allocation that is not being freed. shared memory is the only thing I can think of here
    (1) and (2) can be checked by calls to 'ps aux' before and after runs of your program. The process sizes are included in this list. It's usually not hard to spot the memory hogs. In the unlikely event of (3), you can check with 'ipcs -m'

    Update: I thought of another, less unlikely instance of (3). Ramdisks (or tmpfs filesystems) can also suck up memory; 'df' will show these.

    fnord

      And even if there are zombies, it shouldn't take up much memory. Only thing a zombie needs is its exit value - which is a single byte. There will be a bit more overhead than a single byte, but a zombie should not hold on to memory it created in userland. You're more likely to run out of PIDs than out of memory due to zombies.
Re: memory not freed after perl exits. Solaris.
by Anonymous Monk on Nov 23, 2010 at 11:16 UTC
      You can use the "top" command to check for zombies, in case your program is leaving orphaned processes around once it exits - just run it before and after, and see if the zombie count increases.
        Zombie processes are reaped by the init process as soon as their parent exits, so in practice it is impossible to have orphaned zombie processes!
Re: memory not freed after perl exits. Solaris.
by bart (Canon) on Nov 23, 2010 at 11:48 UTC
    Are you interpreting the data correctly? Perhaps the free memory isn't decreasing, but just more fragmented.
Re: memory not freed after perl exits. Solaris.
by BrowserUk (Patriarch) on Nov 23, 2010 at 13:51 UTC

    Whether this will work on Solaris in the same way that it does in Win I have no idea, but it's worth a shot.

    I noticed that if I've run a program that does a lot of heavy IO, the free memory remains low long after the program has ended due to high file system cache usage. And it can take the system many minutes before it gets around to flushing everything back to disk.

    I found that if I run a process that utilises a lot of VM, then that forces the flush much more quickly.

    On my 4GB system, after running a simple perl -E"$x='x' x 2**31", the free memory is close to its maximum and system cache usage almost zero.

    Using perl -E"$x='x' x 2**30" isn't quite as fully effective, but runs a heap more quickly.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: memory not freed after perl exits. Solaris.
by anonymized user 468275 (Curate) on Nov 23, 2010 at 11:02 UTC
    It might be an idea to post the code so we can look for anything suspicious. It seems too strange that the memory does not free up. I have used Sun a lot and never seen that before.

    One world, one people

Re: memory not freed after perl exits. Solaris.
by locked_user sundialsvc4 (Abbot) on Nov 23, 2010 at 15:11 UTC

    I can’t speak for Solaris, but is it possible that the computer is simply holding things around in memory, hoping that you will reference the same thing again soon? If you wait a couple of minutes, does the memory allocation figure gradually recede on its own?

    (You see the effect of this “lazy” behavior when you run the same program twice in succession: it starts much faster the second time, since most of the code/data is still there. Given the number of times Unix systems tend to invoke the same programs (e.g. in a shell script...), that would be a very big deal.)

Re: memory not freed after perl exits. Solaris.
by Workplane (Novice) on Nov 24, 2010 at 09:52 UTC

    People,

    Firstly THANK-YOU for all your comments.

    I half expected a "not another guy blaming memory not freeing" type response.

    There were a number of replies and it would be impractical to reply to each, so at the risk of breaking the threading, I have gathered all my responses here.

    Generally I've repeated a summary of a suggestion followed by my response.

    Now if I can sort out the formatting....



    Post the code:

    There are about 1000 lines and (almost by definition) I'm not sure which sections to post.

    Posting the lot might be asking a lot of your collective patience.


    Rerun on a local disk.

    I can get the code onto a local disk and run it.

    I only have 5 G locally (on /tmp) and I have about 0.5T of log files. I'll try to get a fragment of the log files on to /tmp and rerun.


    I'm not forking.


    Only 2 zombies afterwards; I'm not sure how many there were before. I'll rerun and see how many there are after another run, but this is after several runs.


    The script ends normally, not with ^C. It prints a message right near the end and closes the output file correctly (I think!). I often run it in the background with ^Z followed by "bg", but not always. No evidence of it with ps -aux.


    Free memory isn't decreasing, just fragmenting.

    I don't know. I'm using vmstat as I described. Is there a better way to assess free memory? It seems to agree with the header in "top".


    Am I using "shared name memory"

    Not as far as I know, I'm letting perl handle everything in that regard. I just keep adding elements to my hash, sometimes I store it and retrieve a new one. In between I undef the hash name. Don't know if that is necessary or helps. Also I open a lot of files and read from them, I only write to 1 or 2 files. I'm pretty sure I'm closing all the files that I open (which I wasn't a while ago).
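    For what it's worth, a minimal sketch of the two usual ways to drop a hash's contents between batches (the hash name here is hypothetical). Note that perl normally keeps the freed memory in its own allocator for reuse rather than handing it back to the OS, so the process size may not shrink, but everything is released when the process exits.

        %results = ();     # empty the hash but keep the variable for reuse
        undef %results;    # or drop the hash's storage entirely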


    Am I storing data on /tmp?

    I'm not storing data on /tmp but I do send the STDOUT of the programme to a log file on /tmp.

    I don't know the nature of the /tmp filesystem:

    $ df -k .
    Filesystem            kbytes    used   avail capacity  Mounted on
    swap                 7077352 1652448 5424904    24%    /tmp

    Does that mean that it is of type "swap"? Sorry for my ignorance.


    run perl -E"$x='x' x 2**31 to flush memory from cache

    If I put it in a file and run it, I can run **29; if I try **30 I get:

    panic: string extend at ./999-workHard.pl line 2.

    **29 takes about 4 seconds.

    If I run it from the command line:

    /usr/bin/perl -e"$x='x' x 2**29"

    syntax error at -e line 1, near "="

    Execution of -e aborted due to compilation errors.

    I don't know how to fix either the limit of **29 or the compilation error when running from command line.


    I have no background jobs running.


    ps -au username

    1 defunct process. No indication of process size.

    The name of the programme I'm running isn't in the list.


    no ramdisks on this machine


    So I wrote the OP yesterday, and when I log on today the "free" in top and vmstat has increased (from ~500M to 2.6G), so maybe it is just the file cache being released over time.

    Can someone help me with the "$x='x' x 2**31" thing? This seems like the most promising answer, or is it a red herring?





    On a completely separate note, how do I do the formatting sensibly?



    Thanks again to everybody.





      [Workplane]:

      With respect to your comment on formatting help. Hopefully, this can serve as an example. But don't forget the handy links at the bottom of the composing window showing the HTML tags, etc.

      As to your posting a large amount of code. The standard advice I give here is to make a copy of your program, and hack out a large chunk that you really don't believe could be affecting the problem. Then run it. If it truly didn't affect the problem (i.e., you're still losing memory), then make another copy, remove another chunk of code, etc. Eventually, you'll find that the last chunk you deleted caused the problem, and you can examine it in more detail. It also gives you a smaller program to post and ask questions about.

      Yes, it can be a bit time consuming, but I expect you're already at the point where living with the problem is also painful. Give it a try and see what you can do with it.

      ...[roboticus]

      Can someone help me with the "$x='x' x 2**31" thing?

      Try:

      perl -e '$x="x" x 2**31'

      All I've done is switch "s for 's and vice versa. That should now run on your Solaris system.

      But you said that you'd got it to run by putting it into a script rather than running it as a one-liner. Did it have the desired effect?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Thanks for the change to quotes. That has made it get further.

        But the result is then the same as before (when I saved it in a file and ran it from there)

        I can run up to **29, but not **30 or **31

        **29 takes about 4 seconds.

        $ perl -e '$x="x" x 2**31' panic: string extend at -e line 1. $ perl -e '$x="x" x 2**30' panic: string extend at -e line 1. $ perl -e '$x="x" x **29' $

        So why can't I run **30 or **31?
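        One possible workaround (an assumption: the panic comes from trying to grow a single scalar beyond what this, presumably 32-bit, perl build can manage): if the goal is only to apply memory pressure so the OS gives up its file cache, several smaller strings held at the same time should do the job. The sizes below are arbitrary examples; reduce them if the process runs out of address space.

            # Hold roughly 1.5GB in six 256MB strings rather than one huge scalar.
            my @chunks;
            push @chunks, 'x' x 2**28 for 1 .. 6;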

      Am I storing data on /tmp?
      I'm not storing data on /tmp but I do send the STDOUT of the programme to a log file on /tmp.
      I don't know the nature of the /tmp filesystem:
      $ df -k .
      Filesystem            kbytes    used   avail capacity  Mounted on
      swap                 7077352 1652448 5424904    24%    /tmp
      Does that mean that it is of type "swap"? Sorry for my ignorance.

      and

      no ramdisks on this machine

      Actually, you do have a ramdisk on the machine. swapfs / tmpfs / ramdisk are functionally the same thing: they take blocks from your VM system and present them through a disk/filesystem interface.

      Is the size of the logfile in your /tmp filesystem the same as the amount of memory that you are seeing your system go down by?

      --MidLifeXis

        So what I understand from what you wrote is:
        by sending the log file (STDOUT) to /tmp I am forcing the system to use memory to hold the log file?
        Is that correct?
        The log file is often larger than the amount of memory which is no longer free (often larger than the system's physical memory).
        Log file may be up to 8G
        Physical memory = 4G
        Memory "lost" on one run ~ 300M to 2G
        But why does the memory stay lost (not free)?
        What can I do to get it back?
        Shouldn't it be given back when the programme exits and the log file is written to disk (flushed)?

        I log in via a Citrix connection over a (relatively) slow link, so I can't have gigabytes spewing out to an xterm. Sending it to a file on /tmp seemed a good way to keep the output for debugging purposes.

        Would I be better off sending the STDOUT to a file on the NFS?
        That seems like an odd thing if so!
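        One option (a sketch; the path below is a made-up example) is to redirect the run log to a disk-backed filesystem rather than swap-backed /tmp, so a multi-gigabyte log does not compete with everything else for physical memory and swap:

            # Reopen STDOUT onto a log file on ordinary disk (local or NFS).
            open STDOUT, '>', '/export/home/geoff/run.log'
                or die "cannot redirect STDOUT: $!";
            $| = 1;   # optional: flush promptly so the log is readable during the run

        The same effect can be had from the shell by redirecting the script's output to such a file when invoking it.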

      People,

      OP again. I know this isn't threaded properly, but with so many replies, to whom should I reply?

      Thanks. Collectively you seem to have fixed it for me.

      I can't be sure exactly which was the fix, but I think the problem was sending the (large) log files to /tmp, which on my system appears to consume memory/swap.

      I've also tried to reduce the log file size.

      I can't be sure exactly what fixed it because a round trip involves rebooting the box, which I can't do (I'm remote from it), so I tried all your suggestions in one go and got a successful result. Not very scientific, I know.

      Thanks again.

      Geoff.