jthomas has asked for the wisdom of the Perl Monks concerning the following question:

Hi experts, have you ever run into an "Out of memory" problem while using HTML::TableExtract? My HTML files are a little large, but I still didn't expect this to happen. Could you suggest some workarounds? I'm calling this subroutine from another for loop.

sub zParseHTMLFiles ($$) {
    my ( $lrefFileList, $lrefColNames ) = @_;
    my @ldata;
    foreach my $lFile (@$lrefFileList) {
        my $lTableExtract = HTML::TableExtract->new( headers => [@$lrefColNames] );
        chomp($lFile);
        $lTableExtract->parse_file($lFile);
        foreach my $ls ( $lTableExtract->tables ) {
            foreach my $lrow ( $lTableExtract->rows ) {
                chomp( @$lrow[$#$lrow] );
                push( @ldata, $lrow );
            }
        }
    }
    return \@ldata;
}

Replies are listed 'Best First'.
Re: Out of memory while using HTML::TableExtract
by GrandFather (Saint) on Jan 06, 2011 at 08:16 UTC

    Maybe if you are tossing so much data around that you are running out of memory you should be storing the data in a database? SQLite (accessed using DBD::SQLite) is easy to get going and to use for this sort of purpose.
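    For example, a minimal sketch of that approach (the database file name, table schema, and column headers below are only placeholders, not anything from the original post):

        use strict;
        use warnings;
        use DBI;
        use HTML::TableExtract;

        # scratch SQLite database; file name and schema are just examples
        my $dbh = DBI->connect( 'dbi:SQLite:dbname=rows.db', '', '',
                                { RaiseError => 1, AutoCommit => 0 } );
        $dbh->do('CREATE TABLE IF NOT EXISTS extracted (c1 TEXT, c2 TEXT, c3 TEXT)');
        my $ins = $dbh->prepare('INSERT INTO extracted (c1, c2, c3) VALUES (?, ?, ?)');

        for my $file (@ARGV) {
            my $te = HTML::TableExtract->new( headers => [ 'Name', 'Size', 'Date' ] );
            $te->parse_file($file);
            for my $ts ( $te->tables ) {
                # each row goes straight to disk instead of into a growing Perl array
                $ins->execute(@$_) for $ts->rows;
            }
        }
        $dbh->commit;
        $dbh->disconnect;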

    True laziness is hard work
Re: Out of memory while using HTML::TableExtract
by Anonymous Monk on Jan 06, 2011 at 08:17 UTC
    I would add
    eval { $lrow->detach; };
    and maybe the code from Re: HTML::TableExtract Memory Usage, at least until HTML::TableExtract incorporates the idea.

    Btw, what is chomp( @$lrow[$#$lrow] ); supposed to do? Apparently in modern perls $#$lrow is slower; indexing the last element with -1 might be faster.
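
    Putting both of those tweaks together, a sketch of the inner loop might look like this (it iterates each table's own rows rather than calling rows() on the extractor; the detach call stays wrapped in eval as suggested, so it dies harmlessly if the rows are plain array refs):

        foreach my $ls ( $lTableExtract->tables ) {
            foreach my $lrow ( $ls->rows ) {
                chomp( $lrow->[-1] );     # same cell as @$lrow[$#$lrow]
                push @ldata, $lrow;
                eval { $lrow->detach; };  # no-op unless $lrow is an HTML::Element
            }
        }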

Re: Out of memory while using HTML::TableExtract
by djerius (Beadle) on Jan 06, 2011 at 20:12 UTC
    You might try HTML::TableParser, which streams the data rather than reading it all into memory.
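
    A rough sketch of how that might look here, matching tables by (placeholder) column names and handling each row in a callback as it is parsed; the exact request options and callback signature should be checked against the module's documentation:

        use strict;
        use warnings;
        use HTML::TableParser;

        my @ldata;

        # one request: match any table containing these (example) columns,
        # and collect each data row as it streams past
        my @reqs = (
            {
                cols => [ 'Name', 'Size', 'Date' ],
                row  => sub {
                    my ( $tbl_id, $line_no, $cols, $udata ) = @_;
                    push @ldata, [ @$cols ];   # or write straight to a file/DB
                },
            },
        );

        for my $file (@ARGV) {
            my $parser = HTML::TableParser->new( \@reqs, { Chomp => 1, Trim => 1 } );
            $parser->parse_file($file);
        }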