Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

'Ola, monks.

I've been having a terrible time using HTML::TableExtract to extract a column from an HTML file using the 'column($col)' subroutine. I'm entirely baffled by this, so I've decided to throw in the towel and let some real professionals take a stab at it.

No matter how I use 'column($col)', I always get a 'row ARRAY(...) out of range (0)' message. I took a peek at the source for HTML::TableExtract, but couldn't determine what's going wrong. Could some benevolent monk show me the way?

Relevant links, for the lazy:
#!/usr/bin/perl
use HTML::TableExtract;
use WWW::Mechanize;
use Data::Dumper;
use strict;
use warnings;

my $sensational = WWW::Mechanize->new( autocheck => 1 );
$sensational->get('http://www.drudgereport.com/');
chomp(my $html = $sensational->content);

my $table = HTML::TableExtract->new();
$table->parse($html);
#$table->tables_dump;

my $t = $table->first_table_found;
#print $t->cell(0,1); # works fine

# each of these generate a 'row ARRAY(...) out of range (0)' message
#print $t->column(1), "\n";
#print for $t->column(1);
#print Dumper $t->column(1);
$t->column(1);

Replies are listed 'Best First'.
Re: HTML::TableExtract woes
by shmem (Chancellor) on Dec 16, 2007 at 15:01 UTC
    File a bug report.
     sub column {
       my $self = shift;
       my $c = shift;
       my @column;
    -  foreach my $row ($self->rows) {
    +  foreach my $row (0..$#{$self->rows}) {
         push(@column, $self->cell($row, $c));
       }
       wantarray ? @column : \@column;
     }

    Method rows() returns an anonymous array or list of rows (arrayrefs), but the cell method works with indices.
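
To make the mismatch concrete, here is a minimal, self-contained sketch. ToyTable is a hypothetical stand-in class written for illustration only; it is not HTML::TableExtract's actual source, and the names column_buggy and column_fixed are assumptions. It only mimics the behavior described above: rows() hands back row arrayrefs, while cell() wants numeric indices.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ToyTable is a hypothetical stand-in, NOT HTML::TableExtract's real code.
package ToyTable;

sub new {
    my ($class, @rows) = @_;
    return bless { grid => [@rows] }, $class;
}

# A list of row arrayrefs in list context,
# a reference to that list in scalar context.
sub rows {
    my $self = shift;
    my @rows = @{ $self->{grid} };
    return wantarray ? @rows : \@rows;
}

# Expects numeric row and column indices; rejects any other row value.
sub cell {
    my ($self, $r, $c) = @_;
    my $last = $#{ $self->{grid} };
    die "row $r out of range ($last)\n"
        unless $r =~ /^\d+$/ && $r <= $last;
    return $self->{grid}[$r][$c];
}

# Buggy variant: passes each row arrayref to cell() as if it were an index.
sub column_buggy {
    my ($self, $c) = @_;
    my @column = map { $self->cell($_, $c) } $self->rows;
    return wantarray ? @column : \@column;
}

# Patched variant: iterates over row indices instead.
sub column_fixed {
    my ($self, $c) = @_;
    my $rows = $self->rows;    # scalar context, so an arrayref
    my @column = map { $self->cell($_, $c) } 0 .. $#$rows;
    return wantarray ? @column : \@column;
}

package main;

my $t = ToyTable->new([qw(a b)], [qw(c d)]);

my @fixed = $t->column_fixed(1);
print "fixed: @fixed\n";

# The buggy variant dies because an arrayref stringifies as ARRAY(0x...).
my @buggy = eval { $t->column_buggy(1) };
my $err = $@;
print "buggy: $err" if $err;
```

The first line printed is "fixed: b d". The second reports a row named ARRAY(0x...) as out of range, which has the same shape as the message in the original question: the arrayref was stringified where an index was expected.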

    update: added link

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

      I just sent a bug report via email; I included the text of this node (as well as links to it). Thanks for lending your keen eye, shmem.

Re: HTML::TableExtract woes
by Sixtease (Friar) on Dec 16, 2007 at 14:13 UTC
    I can only confirm observing the same behavior. The only thing that comes to mind is bypassing it by saying
    my @columns = $t->columns();
    print Dumper $columns[0];
    which seems to work for me. (didn't check the correctness of the data printed though)
    use strict; use warnings; print "Just Another Perl Hacker\n";