jthomas has asked for the wisdom of the Perl Monks concerning the following question:

Hi experts, have you ever run into an "Out of memory" problem while using HTML::TableExtract? My HTML files are a little large, but I still didn't expect this to happen. Could you suggest some workarounds? I'm calling this subroutine from another for loop.

sub zParseHTMLFiles ($$) {
    my ( $lrefFileList, $lrefColNames ) = @_;
    my @ldata;
    foreach my $lFile (@$lrefFileList) {
        my $lTableExtract = HTML::TableExtract->new( headers => [@$lrefColNames] );
        chomp($lFile);
        $lTableExtract->parse_file($lFile);
        foreach my $ls ( $lTableExtract->tables ) {
            foreach my $lrow ( $lTableExtract->rows ) {
                chomp( @$lrow[$#$lrow] );
                push( @ldata, $lrow );
            }
        }
    }
    return \@ldata;
}

Replies are listed 'Best First'.
Re: Out of memory while using HTML::TableExtract
by GrandFather (Saint) on Jan 06, 2011 at 08:16 UTC

    Maybe if you are tossing so much data around that you are running out of memory you should be storing the data in a database? SQLite (accessed using DBD::SQLite) is easy to get going and to use for this sort of purpose.
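    For example, a minimal sketch of that approach (the database file name, table schema, and column headers below are only placeholders, not anything from the original post):

        use strict;
        use warnings;
        use DBI;
        use HTML::TableExtract;

        # scratch SQLite database; file name and schema are just examples
        my $dbh = DBI->connect( 'dbi:SQLite:dbname=rows.db', '', '',
                                { RaiseError => 1, AutoCommit => 0 } );
        $dbh->do('CREATE TABLE IF NOT EXISTS extracted (c1 TEXT, c2 TEXT, c3 TEXT)');
        my $ins = $dbh->prepare('INSERT INTO extracted (c1, c2, c3) VALUES (?, ?, ?)');

        for my $file (@ARGV) {
            my $te = HTML::TableExtract->new( headers => [ 'Name', 'Size', 'Date' ] );
            $te->parse_file($file);
            for my $ts ( $te->tables ) {
                # each row goes straight to disk instead of into a growing Perl array
                $ins->execute(@$_) for $ts->rows;
            }
        }
        $dbh->commit;
        $dbh->disconnect;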

    True laziness is hard work
Re: Out of memory while using HTML::TableExtract
by Anonymous Monk on Jan 06, 2011 at 08:17 UTC
    I would add
    eval { $lrow->detach; };
    and maybe the code from Re: HTML::TableExtract Memory Usage, at least until HTML::TableExtract incorporates the idea.

    Btw, what is chomp( @$lrow[$#$lrow] ); supposed to do? Apparently in modern perls $#$lrow is slower; indexing the last element with -1 might be faster.
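
    Putting both of those tweaks together, a sketch of the inner loop might look like this (it iterates each table's own rows rather than calling rows() on the extractor; the detach call stays wrapped in eval as suggested, so it dies harmlessly if the rows are plain array refs):

        foreach my $ls ( $lTableExtract->tables ) {
            foreach my $lrow ( $ls->rows ) {
                chomp( $lrow->[-1] );     # same cell as @$lrow[$#$lrow]
                push @ldata, $lrow;
                eval { $lrow->detach; };  # no-op unless $lrow is an HTML::Element
            }
        }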

Re: Out of memory while using HTML::TableExtract
by djerius (Beadle) on Jan 06, 2011 at 20:12 UTC
    You might try HTML::TableParser, which streams the data rather than reading it all into memory.
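
    A rough sketch of how that might look here, matching tables by (placeholder) column names and handling each row in a callback as it is parsed; the exact request options and callback signature should be checked against the module's documentation:

        use strict;
        use warnings;
        use HTML::TableParser;

        my @ldata;

        # one request: match any table containing these (example) columns,
        # and collect each data row as it streams past
        my @reqs = (
            {
                cols => [ 'Name', 'Size', 'Date' ],
                row  => sub {
                    my ( $tbl_id, $line_no, $cols, $udata ) = @_;
                    push @ldata, [ @$cols ];   # or write straight to a file/DB
                },
            },
        );

        for my $file (@ARGV) {
            my $parser = HTML::TableParser->new( \@reqs, { Chomp => 1, Trim => 1 } );
            $parser->parse_file($file);
        }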