in reply to Re^3: Massive Perl Memory Leak
in thread Massive Perl Memory Leak

Right now I'm falling back to rebuilding the script block by block from the old version that doesn't spiral out of control. That version is no slouch either: it grows a good bit, but stays reasonable by Perl standards, say 33% growth from start to completion. Most of the code is the same, and of the same type, but whatever changed pushed the memory growth to 400%.

Speaking of garbage collection, is there any way to look into Perl to see just what, when, and where things are garbage collected? I've had, and still have, this suspicion that Perl is just not garbage collecting the way it should.

Re^5: Massive Perl Memory Leak
by chromatic (Archbishop) on Jun 12, 2007 at 21:50 UTC
    Speaking of garbage collection, is there any way to look into Perl to see just what, when, and where things are garbage collected?

    You can use Devel::Peek on a variable and check that its ref count is what you expect.
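    A minimal sketch of that kind of check. The helper name refcount_of and the sample hash are made up for illustration; it reads the count through the core B module so the number can be used programmatically, while Devel::Peek's Dump prints the full SV (including its REFCNT line) to STDERR.

```perl
use strict;
use warnings;
use B ();
use Devel::Peek ();

# Hypothetical helper: reference count of whatever the passed reference points at.
# It reads $_[0] directly, because copying the argument into a lexical would
# itself add one more reference to the count being measured.
sub refcount_of {
    return B::svref_2object($_[0])->REFCNT;
}

my $data = { name => 'example' };    # illustrative data only

my $before = refcount_of($data);     # only $data refers to the hash
my $alias  = $data;                  # a second reference to the same hash
my $during = refcount_of($data);
undef $alias;                        # drop the second reference again
my $after  = refcount_of($data);

print "before=$before during=$during after=$after\n";

# Full dump of the SV, REFCNT included, written to STDERR.
Devel::Peek::Dump($data);
```

    If the count is higher than you expect, something else is still holding a reference to that data.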

    I've had, and still have, this suspicion that Perl is just not garbage collecting the way it should.

    The likelihood that such a thing has hidden for thirteen years seems low, to me.

      If it were single-threaded, yes. But remember threading is still Wild West territory and Perl has never been particularly concerned with memory stinginess. I can just imagine a situation where threads don't know who's supposed to be doing what with the process heap.

      The problem with checking reference counts, and with modules that do so, is that you have to know ahead of time which variable to check. If data is in a state where it "should" be garbage collected, then there's nothing to check a ref count of. When variables die Perl only guarantees that the varname is no longer accessible, not that anything in particular has been done with the SVs behind the scenes. In fact the whole Perl memory management philosophy seems too lackadaisical. That doesn't work anymore when you have a massively parallel script running for a day and a half. :)

        If it were single-threaded, yes. But remember threading is still Wild West territory

        That's a total red herring. Unless you are using shared variables (and even then), there is no effective difference between running two Perl threads and running two perl processes. Each thread has its own interpreter, just as each process does. And variables allocated by each interpreter are exclusive to that interpreter.

        When variables die Perl only guarantees that the varname is no longer accessible, not that anything in particular has been done with the SVs behind the scenes.

        That's wrong too. Reference counting means that variables allocated in any given scope are returned to the memory pool as soon as you leave that scope(*). With most other GC mechanisms, those variables would sit around gathering electronic dust, inaccessible but unreclaimed, until low memory or some other extraordinary event caused the whole program to freeze while the garbage collector scanned the program's entire data space, heap and stack (twice at least), checking what is lying around and whether anything still references it.
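        The scope-exit behaviour is easy to observe with a destructor. This is only an illustrative sketch: the Tracked class and the @events log are invented for the example, and it shows object destruction timing, not how much memory goes back to the operating system.

```perl
use strict;
use warnings;

my @events;    # ordered log of what happened

{
    package Tracked;    # hypothetical class, exists only for this demonstration

    sub new {
        my ($class, $name) = @_;
        return bless { name => $name }, $class;
    }

    # Called by perl at the moment the last reference goes away.
    sub DESTROY {
        my ($self) = @_;
        push @events, "destroyed $self->{name}";
    }
}

{
    my $obj = Tracked->new('inner');
    push @events, 'leaving block';
}    # $obj's reference count drops to zero here, so DESTROY runs here

push @events, 'after block';

print "$_\n" for @events;
```

        The "destroyed" entry lands between the other two, which is the eager reclamation described above: nothing waits for a later collection pass.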

        Which makes:

        In fact the whole Perl memory management philosophy seems too lackadaisical.

        Just about the opposite of reality. Perl's GC mechanism, reference counting, is the most eager GC mechanism possible.

        The only time that falls down is if you are creating circular references--as I mentioned above.
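        A small sketch of the circular-reference case, and of one common way out using weaken from Scalar::Util. The Node class, the parent/child field names and the make_pair helper are assumptions made up for the example.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # how many Node objects have been reclaimed so far

{
    package Node;     # hypothetical class for the demonstration
    sub new     { return bless {}, shift }
    sub DESTROY { $destroyed++ }
}

# Build two nodes that point at each other, then let both lexicals go out of scope.
sub make_pair {
    my ($use_weak) = @_;
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;
    $child->{parent} = $parent;                 # completes the cycle
    weaken($child->{parent}) if $use_weak;      # back-reference no longer counts
    return;
}

make_pair(0);
my $after_strong_cycle = $destroyed;   # cycle keeps both counts above zero

make_pair(1);
my $after_weak_cycle = $destroyed;     # weakened pair is reclaimed on return

print "strong cycle: $after_strong_cycle destroyed, weak cycle: $after_weak_cycle destroyed\n";
```

        The strongly linked pair is only cleaned up at interpreter exit, so in a long-running script each such pair is leaked memory.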

        *Unless your code has passed a reference to those variables out of that scope and whatever received it has failed to let go. That is, your code has done something that necessitates retention. There are also a few special exceptions to do with optimisations for function lexicals, but if they were the root of your problem, they would be known about by now.
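        The retention case in the footnote can be sketched like this. The @cache array, the Buffer class and the process sub are hypothetical stand-ins for whatever long-lived structure a real script might be stuffing references into.

```perl
use strict;
use warnings;

my $destroyed = 0;    # how many Buffer objects have been reclaimed so far

{
    package Buffer;   # hypothetical class for the demonstration
    sub new     { return bless { payload => 'x' x 1024 }, shift }
    sub DESTROY { $destroyed++ }
}

my @cache;            # hypothetical long-lived container

sub process {
    my $buf = Buffer->new;
    push @cache, $buf;    # a reference escapes the sub's scope
    return;
}

process() for 1 .. 3;
my $while_cached = $destroyed;   # nothing reclaimed: @cache still holds all three

@cache = ();                     # let go of the escaped references
my $after_clear = $destroyed;    # now all three are reclaimed

print "while cached: $while_cached, after clearing: $after_clear\n";
```

        From the outside this looks like a leak, but the memory is being held exactly as long as the code asks for it to be held.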


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        I can just imagine a situation where threads don't know who's supposed to be doing what with the process heap.

        Barring bugs where something scribbles on memory it shouldn't, I have a difficult time believing that threads will suddenly forget they own SVs. It's not as if you have to sweep memory pools with a refcounting system.

        If data is in a state where it "should" be garbage collected, then there's nothing to check a ref count of.

        Ref counting is how Perl decides to collect data! How do you separate the two things when the latter depends on the former?

        When variables die Perl only guarantees that the varname is no longer accessible....

        I don't know what it means for a variable to "die". Regardless, if what you said were true, if there were some connection between a variable's name and its liveness, then I don't see how anonymous variables would work.

        ... not that anything in particular has been done with the SVs behind the scenes.

        The ref count is part of the SV structure!