Prime,
Can you provide a little more information about the contents of the Bench file, and what print_report() is doing?
From what I gather, the bench file is simply a list of absolute file paths on the filesystem (since you're using a find call to populate %today). What exactly are you trying to track?
Another question - have you confirmed your find command on your machine? On my box (Red Hat 9), that call to find (assuming $search_files is a scalar holding a text match of some kind) would return every file on the filesystem. Are you sure you're getting the correct results?
Now that I think about it, I've got an idea on a general approach, assuming you've got access to the standard Unix utils - use sort, uniq, and diff, and parse the output of the diff. e.g.
    `cat benchmark_files|sort|uniq -c > benchmark_counted`;
    `find / $search_files -print |sort | uniq -c > todays_find`;
    open IN, "diff benchmark_counted todays_find|" or die "$!";
    while (<IN>) {
        ## parse diff output into %yesterday and %today
        ## an exercise for the reader
    }
    close IN;
By using the unix tools, you've now got the same output as you had after the call to _scan_system(). Note - diff will flag identical lines with different counts (that's what the -c option to uniq does) - you'd have to account for that when parsing the diff output.
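If it helps, here is a rough sketch of what that parsing loop might look like. It assumes diff's default output format ("< " for lines only in the first file, "> " for lines only in the second) and the count-then-path layout that uniq -c produces. The name parse_diff and the sample paths are made up for illustration, and I'm guessing that %yesterday should hold files that disappeared and %today files that are new.

```perl
use strict;
use warnings;

# Hypothetical helper: split default-format diff output into two hashes.
sub parse_diff {
    my ($fh) = @_;
    my (%yesterday, %today);
    while (my $line = <$fh>) {
        chomp $line;
        # "< " lines come from the first file, "> " lines from the second.
        # uniq -c puts a right-justified count in front of each path.
        # Hunk headers ("2,3c2,3") and "---" separators do not match.
        next unless $line =~ /^([<>]) \s*(\d+) (.*)$/;
        my ($side, $count, $path) = ($1, $2, $3);
        if ($side eq '<') { $yesterday{$path} = $count }
        else              { $today{$path}     = $count }
    }
    # A path on both sides differs only in its count, so it is
    # neither new nor missing: drop it from both hashes.
    for my $path (grep { exists $today{$_} } keys %yesterday) {
        delete $yesterday{$path};
        delete $today{$path};
    }
    return (\%yesterday, \%today);
}

# Made-up sample of diff output, read through an in-memory filehandle.
my $sample = join '', map { "$_\n" }
    '2,3c2,3',
    '<       1 /etc/old.conf',
    '<       1 /etc/passwd',
    '---',
    '>       2 /etc/passwd',
    '>       1 /tmp/new.log';

open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my ($yesterday, $today) = parse_diff($fh);
close $fh;

print "missing: $_\n" for sort keys %$yesterday;
print "new: $_\n"     for sort keys %$today;
```

With a real pipe you would pass the handle from the open call instead of the in-memory one. Keep in mind that diff exits with status 1 when the files differ, so close on the pipe returns false even when everything worked.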
This assumes, of course, that the real memory hog is %yesterday, before a pile of keys are deleted in building %today. If I'm wrong, and at the end of processing %yesterday and %today are both too big for print_report() to handle, you may well need to look at some kind of BerkeleyDB-type solution, but realize it's going to slow things down by a lot.
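For what it's worth, a disk-backed hash could look roughly like the sketch below. I'm using SDBM_File here only because it ships with perl; it is a stand-in, not BerkeleyDB itself, and it limits how large a key/value pair can be. DB_File is tied in much the same way if you have Berkeley DB installed. The file location and sample paths are made up.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location for the on-disk hash; a temp dir keeps the
# sketch self-contained and is removed when the program exits.
my $dir    = tempdir(CLEANUP => 1);
my $dbfile = "$dir/yesterday";

# Keys and values are stored in files on disk, not in process memory.
tie my %yesterday, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $dbfile: $!";

$yesterday{'/etc/passwd'}   = 1;
$yesterday{'/etc/old.conf'} = 1;

# Ordinary hash operations still work, but each one now costs file I/O.
delete $yesterday{'/etc/passwd'};
my @left = sort keys %yesterday;
print "still tracked: @left\n";

untie %yesterday;
```

The rest of the code can keep treating %yesterday as a normal hash, which is the appeal. The cost is that every lookup, store, and delete goes to disk.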
I hope this helps - sort/diff/uniq can be a great way to reduce the load on perl when processing large files.
In reply to Re: Memory Management Problem
by swngnmonk
in thread Memory Management Problem
by PrimeLord