in reply to Out of Memory using Mail::Mbox::MessageParser

Also, having looked at this code only superficially, I am suspicious of the use of recursion here: I see process_mbox calling scan_folder.

To start, I would add quite a few print STDERR "some message" statements throughout the code so that you can observe the recursion as it occurs. It could be that at some point it is recursing endlessly or too deeply. In any case, it could be holding on to far more resources than you realize.
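As a rough illustration of that kind of tracing, here is a minimal sketch. The scan_folder below is only a hypothetical stand-in that walks a made-up nested hash, not the code from the original post; the point is the depth counter and the indented print STDERR line on every entry.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the recursive scan: it walks a nested hash
# instead of real mail folders. Only the tracing technique matters here.
my $max_depth = 0;

sub scan_folder {
    my ($name, $subtree, $depth) = @_;

    # Remember the deepest level seen so far.
    $max_depth = $depth if $depth > $max_depth;

    # One indented trace line per call, so nesting is visible at a glance.
    print STDERR '  ' x $depth, "scan_folder($name) depth=$depth\n";

    # Recurse into each child, one level deeper.
    scan_folder($_, $subtree->{$_}, $depth + 1) for sort keys %$subtree;
    return;
}

# Made-up folder tree, purely for demonstration.
my %tree = (Inbox => { Work => { Old => {} }, Personal => {} });

scan_folder($_, $tree{$_}, 0) for sort keys %tree;
print STDERR "deepest level reached: $max_depth\n";
```

If the same folder name keeps reappearing at ever-increasing depth in the trace, that would point to runaway recursion.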

Therefore, I would suggest redesigning this routine so that it is non-recursive, working instead from a “to-do list.” The list starts by containing only the root folder(s), and as subfolders are encountered their fully-qualified pathnames are added to the list, but those folders are not processed at this time. The main loop therefore processes only one mailbox at a time. If the processing of a mailbox is done by a sub with local variables, the relevant objects will be disposed of as each call returns, and Perl’s memory manager will keep the place tidy.
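A minimal sketch of that to-do-list loop, assuming the folders are plain directories on disk. The names walk_folders and process_mailbox are hypothetical, and the "processing" here merely records the file size in place of any real message parsing.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical per-mailbox worker. Everything it allocates lives in
# lexical variables, so it is released when the sub returns.
sub process_mailbox {
    my ($path) = @_;
    my $size = -s $path;              # placeholder for real parsing
    return defined $size ? $size : 0;
}

# Non-recursive walk: @todo holds directories still waiting to be scanned.
sub walk_folders {
    my (@roots) = @_;
    my @todo = @roots;
    my %size_of;

    while (defined(my $dir = shift @todo)) {
        my $dh;
        unless (opendir($dh, $dir)) {
            warn "cannot open $dir: $!\n";
            next;
        }
        my @entries = grep { !/^\.\.?$/ } readdir $dh;
        closedir $dh;

        for my $entry (sort @entries) {
            my $path = File::Spec->catfile($dir, $entry);
            next if -l $path;          # skip symlinks in this sketch
            if (-d $path) {
                push @todo, $path;     # defer the subfolder; do not descend now
            }
            elsif (-f $path) {
                $size_of{$path} = process_mailbox($path);
            }
        }
    }
    return \%size_of;
}

# Usage: perl walk.pl /path/to/mail/root
if (@ARGV) {
    my $sizes = walk_folders(@ARGV);
    print "$_: $sizes->{$_} bytes\n" for sort keys %$sizes;
}
```

Only one directory handle is open at any moment, and nothing from one mailbox survives into the next iteration except the entry stored in the result hash.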

If a hash is used to maintain the to-do list (the folder is the key; a true/false value indicates whether it has been processed yet), loops in the folder structure can no longer cause loops in the logic: the algorithm can tell whether it has ever seen a key before (exists() ...). Don’t attempt to keep resources (other than the connection) open; let them be re-created each time.
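The hash-based variant could look something like the sketch below. The folder structure is a made-up in-memory table with a deliberate loop in it, and process_all is a hypothetical name; a real version would list subfolders from the mail store instead.

```perl
use strict;
use warnings;

# Made-up folder structure with a deliberate loop: root/A/B points back to root/A.
my %children_of = (
    'root'     => ['root/A'],
    'root/A'   => ['root/A/B'],
    'root/A/B' => ['root/A'],
);

sub process_all {
    my ($children_of, @roots) = @_;

    # To-do list: folder => 0 (pending) or 1 (processed).
    my %todo = map { $_ => 0 } @roots;
    my @order;

    while (1) {
        # Pick any pending folder; sorting just makes the order predictable.
        my @pending = sort grep { !$todo{$_} } keys %todo;
        last unless @pending;
        my $folder = $pending[0];

        push @order, $folder;          # the real per-mailbox work would go here
        $todo{$folder} = 1;

        # Add newly discovered subfolders, but never re-add a known one.
        for my $child (@{ $children_of->{$folder} || [] }) {
            $todo{$child} = 0 unless exists $todo{$child};
        }
    }
    return @order;
}

my @order = process_all(\%children_of, 'root');
print "$_\n" for @order;
```

Because a folder is only added when exists says it is new, the loop in the sample data is visited once and then ignored. Rescanning the hash for a pending key on every pass is quadratic; for a large tree, a separate queue alongside the hash would avoid that.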

I suspect that this non-recursive approach will be much simpler, easier to debug, and will use fewer resources.

Re^2: Out of Memory using Mail::Mbox::MessageParser
by u65 (Chaplain) on Jun 19, 2015 at 12:43 UTC

    Good ideas.