That's the stupid "lalala I can't hear you" method. The problem is still there. Did you bother looking at line 1229 of C:/Perl64/site/lib/HTML/TableExtract.pm?
The current version 2.10 (dated 15 Jul 2006) at cpan.org has the following code around that line:
sub strip {
    my $self = shift;
    $self->parse(shift);
    $self->eof;
    $self->{_htes_tidbit};
}
Line 1229 is the call to the parse method. My guess is that strip is called as a method without arguments or with an undefined argument.
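If you want to confirm that guess instead of trusting mine, let Perl tell you who passed the undefined value. The following is only a minimal sketch: strip_like is a hypothetical stand-in I made up for the demo, not the module's code. The trick itself (a __WARN__ handler that asks Carp for a stack trace) uses only core Perl and should work the same way around your real script.

```perl
use strict;
use warnings;
use Carp ();

# Hypothetical stand-in for the real strip method. It only reproduces
# the "undefined argument" situation; it is not code from the module.
sub strip_like {
    my ($text) = @_;
    return "stripped: $text";    # warns when $text is undef
}

my @traces;
{
    # Turn every warning into a message with a full stack trace,
    # so the caller that passed undef becomes visible.
    local $SIG{__WARN__} = sub { push @traces, Carp::longmess($_[0]) };
    strip_like(undef);
}

print $traces[0];
```

In your real program, you would install the handler near the top of the script and print to STDERR instead of collecting the traces in an array.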
In line 1196, you find "package HTML::TableExtract::StripHTML;", and in line 1201, you find @ISA = qw(HTML::Parser);. So, there is a helper class HTML::TableExtract::StripHTML inheriting from HTML::Parser. Does HTML::Parser implement a strip method that may be called without a defined argument? Or is the strip method called by HTML::TableExtract?
HTML::Parser does not document a strip method. But a simple and stupid search inside TableExtract.pm for the exact word "strip" shows two matches, the sub starting in line 1227, and a strip method call in line 625, "$target = $stripper->strip($item);". And lo and behold, the previous line 624 creates an instance of HTML::TableExtract::StripHTML: "my $stripper = HTML::TableExtract::StripHTML->new;".
So, strip is always called with an argument ($item), but that argument may become undefined. Looking up a few lines, you can see that $item is initialised by dereferencing $ref as a scalar, in line 622. Scalars can be undef, and no code around checks that condition. Looking up a few more lines, you can see that $ref is initialised from my $ref = $self->{grid}[$r][$c]; in line 617.
My guess (from variable names and the ROW: label in line 612) is that this part of the code iterates over a 2D array in $self->{'grid'} that represents one or more HTML tables. But HTML tables may have empty cells (<td></td>), missing cells (at the end of table rows), and cells spanning rows and/or columns (rowspan and colspan attributes). Perhaps the HTML::TableExtract code can't handle one or more of these conditions, and the array element becomes undefined.
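To illustrate what I mean, here is a small sketch of a grid with holes. The layout (an array of rows, each cell a reference to a scalar) is purely my assumption, guessed from the scalar dereference in line 622; the real $self->{grid} may look different. It shows the two ways a cell can go wrong: the reference exists but points to undef, or there is no reference at all.

```perl
use strict;
use warnings;

# Hypothetical grid, NOT taken from HTML::TableExtract:
# every cell is a reference to a scalar holding the cell text.
my @grid = (
    [ \"a1", \"b1", \"c1" ],
    [ \"a2", \undef ],          # undefined content, plus a missing cell at the end
    [ \"a3", undef, \"c3" ],    # no reference at all, e.g. a slot covered by a span
);

my @problems;
for my $r (0 .. $#grid) {
    for my $c (0 .. 2) {
        my $ref = $grid[$r][$c];
        if (!defined $ref) {
            # Dereferencing here would die under "use strict".
            push @problems, "row $r, col $c: no cell";
            next;
        }
        my $item = $$ref;
        push @problems, "row $r, col $c: undefined content"
            unless defined $item;
    }
}

print "$_\n" for @problems;
```

Code that passes such an $item on without a defined() check ends up handing undef to the next method in the chain.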
Look at the HTML you feed to HTML::TableExtract. Is it invalid HTML? (Use the W3C validator to find out.) Does it contain empty table cells? Does it contain cells spanning rows and/or columns?
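For a quick first look at the input, a crude scan like the following may be enough. This is only a sketch: the sample HTML is made up, and regular expressions are not an HTML parser, so treat the counts as hints and not as proof. It does not replace the validator.

```perl
use strict;
use warnings;

# Made-up sample input; replace it with the HTML you actually feed
# to HTML::TableExtract.
my $html = <<'HTML';
<table>
<tr><td>1</td><td></td><td colspan="2">wide</td></tr>
<tr><td rowspan="2">tall</td><td>x</td></tr>
</table>
HTML

# Count cells with nothing but whitespace between the tags.
my $empty = () = $html =~ m{<t[dh]\b[^>]*>\s*</t[dh]>}gi;

# Count cells carrying a rowspan or colspan attribute.
my $spanned = () = $html =~ m{<t[dh]\b[^>]*\b(?:row|col)span\s*=}gi;

print "empty cells: $empty, spanning cells: $spanned\n";
```

If either count is non-zero, cut the input down to the smallest table that still triggers the warning. That makes a much better bug report.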
If the input is not valid HTML, fix it. If missing cells cause the problem, fix the input. If empty or spanning cells cause the problem, search for an existing bug report, and file a new one if the problem is not yet known.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)