Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, I currently have a little problem with the garbage collection (?) which I somehow cannot really understand.

I prepare a rather huge hash (some GBs). Then I start several (8) threads (fork) that do a lot of reading and a bit of writing on this hash. Because of copy-on-write, Perl basically ends up creating a copy of the hash just by reading it. Anyway, it's not a real problem while the threads are running, but when I want to exit() the threads, it takes ages and the server even starts swapping.
I guess it is the garbage collection kicking in... but why would it temporarily take so much memory (several times what it needed while running) that the server starts swapping?

So my question is: what is really happening here, and what can I do to avoid the swapping and improve performance? Pseudo code:

create_huge_hash();
fork();
# child_thread:
read_and_write_hash();
exit();
Problem: when it reaches exit(), the server waits ages, and during that time it uses more memory than before and sometimes starts swapping.
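
A slightly more concrete sketch of the structure (the two *_hash() functions are just placeholders for the real code):

use strict;
use warnings;

my %huge_hash = create_huge_hash();    # several GB of data

my @pids;
for my $n (1 .. 8) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: mostly reads %huge_hash, writes a little
        read_and_write_hash(\%huge_hash, $n);
        exit(0);    # <-- here it hangs for ages and the server starts swapping
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;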

Thanks in advance for your answers :)

Replies are listed 'Best First'.
Re: problems with garbage collection
by BrowserUk (Patriarch) on Jul 13, 2010 at 19:04 UTC

    Copy-on-write happens in pages. So whilst you are running, if your child processes (please don't call them "threads") are each only accessing small subsets of the hash, then only small parts are replicated. But when clean up occurs, every scalar has to have its reference count decremented to 0 before it can be released, and that means that all of the hash has to be replicated for every child before it can complete.

      But when clean up occurs, every scalar has to have its reference count decremented to 0 before it can be released

      Actually, during "global destruction", most reference counts are not decremented. Lots of allocated memory isn't free()d either. Perl makes sure to call any DESTROY()s that need to be called, but for most ordinary data structures it just leaves them allocated and lets the act of the process exit()ing efficiently de-allocate everything in one fell swoop, rather than making thousands of calls to free() and decrementing every reference count of every item until they all reach zero.

      But your explanation might be quite correct for the case being discussed. It certainly seems to fit.

      Perhaps the performance could be improved by ensuring that the large hash is not destroyed until the global destruction phase has begun. Though, given the vague code provided so far, I don't see why the hash wouldn't live that long. The OP should probably get to work investigating or just providing more details (like producing a minimal test case that reproduces the problem).

      You can also avoid Perl's clean-up phase entirely via POSIX:

      use POSIX '_exit';
      _exit(0);

      instead of using exit.
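
      For illustration, here is a tiny stand-alone example (not the OP's code) of what gets skipped:

      use strict;
      use warnings;
      use POSIX '_exit';

      END { print "END block ran\n" }          # never runs when _exit() is used

      package Guard;
      sub new     { bless {}, shift }
      sub DESTROY { print "DESTROY ran\n" }    # never runs when _exit() is used

      package main;
      my $g = Guard->new();

      _exit(0);    # terminates immediately: no global destruction, no END blocks,
                   # and buffered output is not flushed
      # exit(0);   # a normal exit would run (and print) both of the above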

      BTW, if that simply solves the problem, I do hope the OP will still provide more information so we can reproduce the problem and properly understand it.

      - tye        

        Hello, thank you all for the help. The POSIX::_exit(0) solution definitely helps: it makes this part of the software MUCH faster, and it no longer starts swapping where it did before. I guess I would never have found that solution. I still have to test how far we can go with the current hardware before we max out, and whether I can find some things to optimize the RAM usage.

        Sorry, I don't know how to make a small test case. Maybe it is helpful if I describe the software a bit. I currently have to parallelize some old software to reduce the absolute time needed to work through a certain number of datasets, and to make better use of the current hardware.

        We have lots of datasets in several databases. Now we get a new version of the datasets in CSV files and want to compare them with the old ones (via fingerprint) to find changes. Where something changed, we write export files for other software and update the values in the databases.

        For this, the parent process loads the fingerprint values (and other data) into a huge hash. Then we fork as many processes as we have processors. Because of copy-on-write in Perl, each process ends up with a local copy just from reading the original hash, and the amount of RAM needed explodes... :(
        The current hardware is a 16-core system with 24GB of RAM and 8GB of swap (don't kill me, I am not the sysadmin behind the swap size decision ;)).
        The largest examples we have go up to about 3.5 million datasets; before the _exit() solution we already got into trouble at about 1.5 million datasets.

        I now have to find out how far we can go and whether there are still ways to optimize the software a bit, so we can use it even for the largest numbers of datasets. But that's work for tomorrow :)
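
        Roughly, the worker part now looks like this (simplified; load_fingerprints() and process_chunk() stand in for the real code):

        use strict;
        use warnings;
        use POSIX ();

        my %fingerprints = load_fingerprints();    # the huge hash (placeholder)
        my $workers      = 16;                     # one child per core

        my @pids;
        for my $w (0 .. $workers - 1) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {
                # child: process its share of the datasets, compare them against the
                # fingerprints, write export files, update the database
                process_chunk(\%fingerprints, $w, $workers);
                POSIX::_exit(0);    # skip Perl's cleanup so the shared pages don't all get copied
            }
            push @pids, $pid;
        }
        waitpid($_, 0) for @pids;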

Re: problems with garbage collection
by Corion (Patriarch) on Jul 13, 2010 at 18:15 UTC

    Maybe you're running out of "memory" much earlier, but it is only when cleaning up its memory that the machine needs to bring the swapped-out data back in from disk, just to discard it? You can try to force an immediate exit, bypassing all END blocks and other cleanup, by calling POSIX::_exit. Note that tempfiles won't get erased that way, and other resources may also not get released.
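
    For example (hypothetical snippet, assuming the tempfiles come from File::Temp):

    use strict;
    use warnings;
    use File::Temp ();
    use POSIX ();

    my $tmp = File::Temp->new();    # normally deleted by File::Temp's cleanup at exit
    print {$tmp} "scratch data\n";

    POSIX::_exit(0);                # ends the process right here: the cleanup never runs,
                                    # so the temporary file is left behind on disk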

      Thanks for this help. It seems to make things much better in certain situations :)