in reply to Strange memory leak question. Please help!

I ran into a similar, seemingly unavoidable problem with memory consumption when I was facing a huge number of Excel files, and decided to use Spreadsheet::ParseExcel to normalize/condense/combine the data from all of them. For each new Excel file that I opened, read, processed and closed, the module just kept taking up more memory, instead of re-using the space that was allocated for a previous file.

I decided to do a work-around, whereby I would process files until some reliable event occurred (e.g. changing directory, because there were never too many files in a single folder), write a "checkpoint" file to indicate how far I had gotten in the overall list, and exit. On start-up, the script would read the checkpoint file to figure out which directory to do next. Then it was just a matter of putting the script in a shell loop, running it enough times to cover the whole set.
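A minimal sketch of that checkpoint-and-restart loop, in case it helps. This is not my original script: the file names (checkpoint.txt, dirs.txt) and the functions process_dir, run_once and run_all are made up for illustration. It also assumes the checkpoint simply stores how many directories from the list have been finished.

```shell
#!/bin/sh
# Sketch of a checkpoint-and-restart driver. All names here are hypothetical.
CHECKPOINT="${CHECKPOINT:-./checkpoint.txt}"  # number of directories already done
DIR_LIST="${DIR_LIST:-./dirs.txt}"            # one directory per line, in processing order

# Placeholder for the real worker, e.g. a Perl script that parses every
# Excel file in one directory and then exits.
process_dir() {
    echo "processing $1"
}

# Do one unit of work: find the first directory after the checkpoint,
# process it, then record the progress.
# Returns 1 when the list is exhausted, 2 if the worker failed.
run_once() {
    done_count=0
    if [ -f "$CHECKPOINT" ]; then
        done_count=$(cat "$CHECKPOINT")
    fi
    next=$(sed -n "$((done_count + 1))p" "$DIR_LIST")
    [ -n "$next" ] || return 1          # nothing left to do
    process_dir "$next" || return 2     # leave the checkpoint untouched on failure
    echo "$((done_count + 1))" > "$CHECKPOINT"
}

# The shell loop: keep running single units until one reports "done" or fails.
run_all() {
    while run_once; do
        :
    done
}
```

With the directories listed in dirs.txt, calling run_all works through them one at a time. In a real setup the body of run_once would be the Perl script itself, started as a separate process on each pass, so whatever memory it accumulated goes back to the system when it exits.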

In your case:

Either way, most of your trouble comes from trying to do too much in one huge monolithic script. Break it down into simpler components -- that's likely to improve performance in a lot of ways, and will make it easier to maintain; it's a win-win approach.

Busted: Strange memory leak question. Please help!
by catsophie (Initiate) on Sep 22, 2007 at 05:19 UTC

    Thanks to all for the quick help. Below is my report on the question.

    talexb, I suspected that my Perl program was eating the memory because I used 'free -m' to watch the free memory. While the Perl program ran, free memory decreased very fast, and it was not released after the program stopped.

    Fletch, you got it. I forgot to delete the tree. Since I called HTML::TreeBuilder many times, that wasted a serious amount of memory. After I started deleting each tree, the memory leak was almost solved.

    When I say 'almost', I mean there is still a very slow leak, about 1 MB every several minutes. graff is right: the trouble comes from my large script (1305 lines :P). I should break the script into smaller components.

    I didn't try Devel::Cycle or Test::Memory::Cycle, since I don't have complex reference structures.

      That is not a good way to detect a memory leak. You are looking at the total free memory on the system, which could go up or down due to pretty much anything happening on that box. It only worked for you because the leak was so large.

      Far better would be to find the pid of your process and run ps l on it periodically. Look at the VSZ (virtual memory size) column. If it stops growing once the program has settled into its work, then you don't have a leak.
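For what it's worth, the periodic check can be scripted. The sketch below is only one way to do it: vsz_of and watch_vsz are made-up helper names, and it uses `ps -o vsz= -p PID` to pull out just the VSZ column rather than reading it off the `ps l` output by eye. The default interval and sample count are arbitrary.

```shell
#!/bin/sh
# Print the virtual memory size (the VSZ column of ps) for one process.
vsz_of() {
    ps -o vsz= -p "$1" | tr -d ' '
}

# Sample the VSZ of a process at a fixed interval.
# Usage: watch_vsz PID [INTERVAL_SECONDS] [SAMPLE_COUNT]
# Stops early if the process goes away.
watch_vsz() {
    pid=$1
    interval=${2:-5}
    count=${3:-12}
    i=0
    while [ "$i" -lt "$count" ] && kill -0 "$pid" 2>/dev/null; do
        printf '%s %s\n' "$(date +%H:%M:%S)" "$(vsz_of "$pid")"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `watch_vsz 12345 60 30` would print one timestamped VSZ reading per minute for half an hour for pid 12345. A number that keeps climbing across samples points at a leak in that process, independent of whatever else is happening on the box.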

        Thanks, sfink, I got it!