Disclaimer: This module is provided only for backwards compatibility with earlier versions of this library. New code should not use this module, and should really use the HTML::Parser and HTML::Treebuilder modules directly, instead.
$c->{data} = $stripper->format(parse_html($c->{data}))
Apart from that, I wouldn't know whether the memory leak is due to the "unfinished" query statement objects, as suggested by others above, or whether it's due to stranded (non-garbage-collected) objects in the HTML parsing/formatting modules.
One way to tease that apart would be to divide the process into two distinct steps (two processes): step/process 1 is to parse the html data and output tab-delimited flat tables that can then be inserted into your database by any of several easy methods. If that process succeeds, you can conclude that it was the database interaction that caused the leak.
In any case, a better way to load large quantities into mysql tables is via the "mysqlimport" tool that comes with mysql -- it's incredibly fast compared to using Perl/DBI for inserts, and it's the best/easiest way to load a table from a tab-delimited flat-text file. (Rather, Perl/DBI is incredibly slow relative to mysqlimport.)
Another idea: since you are looping over two main directories, you might try just doing one directory per run (giving the desired path name on the command line). If you really want to do both in one run (after solving the memory leak issue), a better loop method would be:
instead of that clunky while-loop.for my $path ( 'modified', 'deleted' ) { ... }
In reply to Re: Massive Memory Leak
by graff
in thread Massive Memory Leak
by martin_ldn
For: | Use: | ||
& | & | ||
< | < | ||
> | > | ||
[ | [ | ||
] | ] |