in reply to "write hash to disk after memory limit"

I have tried "undef" on the biggest hashes but this does not free memory to the OS.

You haven't explained your reasoning for wanting to "free this memory back to the OS"; but I think you are aiming for the wrong goal.

Once you've undef'd the hash, the memory it occupied will no longer be accessible to your program; thus, its place in physical memory will quickly be taken over by data you are still using, as pages are exchanged with the swap file.

That is to say: once a process moves into swapping, the memory you are using will be kept in physical RAM, and the memory you are not using -- on a least-recently-used basis -- will be 'swapped out' to disk in the system swap file. Once there, it will have no effect on the performance of your process (or your system) unless it is accessed again, at which point it needs to be exchanged with something else currently in physical RAM.

So, once you've finished with your hash, you are better off letting it get swapped out to disk as the system sees fit than you are trying to reclaim it. This is because the very act of undef'ing the hash will cause Perl's garbage collection mechanism to visit every key and value (and every subkey and value; and every element of those arrays of arrays) in order to decrement their reference counts and (if appropriate) free them back to the process memory pool (heap). In order to do that, any parts of the hash -- every last dusty corner of them -- that may have been benignly lying dormant in the swap file for ages will need to be swapped back in to be freed; and in the process, memory that you do need to access may get swapped out, only to have to be brought back in -- exchanged with the now inaccessible pages that used to contain the redundant hash -- almost immediately.

Upshot: If you cannot avoid your process moving into swapping in the first place -- and you haven't supplied enough details about the nature and size of your data for us to help you do that -- then *DO NOT attempt to free the hash*. Far better to let the system deal with making sure that the memory your process needs -- along with what all the other running processes need -- is available when it is needed.

And if you are concerned with how long your process takes to end -- after you've finished your processing -- because it spends ages reloading swapped-out, redundant pages during final global clean-up, then call POSIX::_exit(), which will bypass that clean-up and terminate your process quickly and efficiently.

(BUT only do so once you've manually closed any and all output files, data ports etc.; otherwise you could lose data!)
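
For example (a minimal sketch only; $out_fh and 'results.txt' are just stand-ins for whatever handles and files your real code has open):

    use strict;
    use warnings;
    use POSIX ();                                    # for POSIX::_exit()

    open my $out_fh, '>', 'results.txt' or die "open: $!";

    # ... build and use your hashes; write your results ...
    print { $out_fh } "done\n";

    # Flush and close everything you have written to *before* exiting:
    close $out_fh or die "close: $!";

    # Skip Perl's global destruction (and the swap-in of all those dead pages):
    POSIX::_exit( 0 );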

I'd also urge you to more fully describe your data -- a few sample records -- and volumes; because it is nearly always possible to substantially reduce the memory requirements; even at the cost of a little up-front performance. Avoiding moving into swapping will speed your process by 1 or 2 orders of magnitude in many cases; so it leaves a lot of scope for trading a little speed for a lot of memory and coming out a net winner.
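
To give a flavour of the kind of trade I mean -- and this is a sketch based on a pure guess that your values are smallish integers, since you haven't shown us any -- packing the records into fixed-width strings, one scalar per key, rather than building arrays of arrays, cuts the per-record overhead dramatically:

    use strict;
    use warnings;

    my %data;

    # Instead of an array of arrays per key:
    #     push @{ $data{ $key } }, [ $x, $y, $z ];   # dozens of bytes of overhead per record
    # append each record to a single packed scalar per key:
    my( $key, $x, $y, $z ) = ( 'example', 1, 2, 3 );
    $data{ $key } .= pack 'NNN', $x, $y, $z;         # 12 bytes per record

    # and unpack on demand:
    my @fields = unpack '(NNN)*', $data{ $key };
    print "@fields\n";                               # 1 2 3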
