resolving HTML::TableExtract error

jaydon has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am getting the following error when running a simple script:

"Can't locate auto/HTML/TableExtract/rows.al in @INC"

Here is the code snippet:

my $te;
$te = HTML::TableExtract->new( headers => [qr/Month\s*/,
                                           qr/First\s*/,
                                           qr/High\s*/, 
                                           qr/Low\s*/, 
                                           qr/Sett\s*/, 
                                           qr/Chg\s*/, 
                                           qr/Vol\s*/, 
                                           qr/GOWAVE\*\s*/] );
$te->parse_file($sourcefile);

my ($row, $record);
open (DATFILE, ">> meg.dat") or die "Unable to open meg.dat: $!";
print DATFILE "Table:\n";
foreach $row ($te->rows) {         #code failed at this line
   $record =  join(',', @$row);
   print DATFILE $record . "\n";
}
close DATFILE;

I just installed this module and have verified that the TableExtract.pm is in a location specified by @INC, and that the rows() subroutine is defined in there. However, there is no rows.al at that location.

Can anyone tell me what this means? Did my installation not work correctly even though no failure was indicated?
PS: I'm a newbie so please be kind

Re: resolving HTML::TableExtract error by mojotoad (Monsignor) on Jul 14, 2005 at 00:34 UTC
Hi there. I'm the author of the module. Could you tell me which version you're using? I'm assuming it's the most recent based on what you said, but you never know. Also, merely having the location in @INC could be ambiguous if there are multiple versions of the module in those directories. Examine %INC and double-check that you're loading the right version. Having said all that, the error you report shouldn't really be happening in any of the versions, so I'm curious to track it down. Thanks, Matt
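
The %INC check suggested above can be sketched roughly as follows. This is only an illustration: `module_info` is a hypothetical helper name, not something provided by HTML::TableExtract, and the script simply reports that the module could not be loaded if it is not installed.

```perl
use strict;
use warnings;

# Report which file a module was actually loaded from, and its version.
# module_info() is a hypothetical helper, not part of HTML::TableExtract.
sub module_info {
    my ($module) = @_;
    (my $key = "$module.pm") =~ s{::}{/}g;    # HTML::TableExtract -> HTML/TableExtract.pm
    return unless eval { require $key; 1 };   # empty list if the module cannot be loaded
    return ( $INC{$key}, $module->VERSION );
}

my ( $path, $version ) = module_info('HTML::TableExtract');
if ( defined $path ) {
    print "Loaded from: $path\n";
    print "Version:     ", ( defined $version ? $version : 'unknown' ), "\n";
}
else {
    print "HTML::TableExtract could not be loaded\n";
}

# List every copy reachable through @INC, in search order (the first one wins).
for my $dir ( grep { !ref $_ } @INC ) {
    my $candidate = "$dir/HTML/TableExtract.pm";
    print "Found: $candidate\n" if -e $candidate;
}
```

If more than one "Found:" line appears, the "Loaded from:" line shows which copy Perl actually picked up.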
Re^2: resolving HTML::TableExtract error by jaydon (Novice) on Jul 14, 2005 at 16:20 UTC
Hello Matt, I'm using version 2.02. My script was successful once I had incorporated the suggestions of crashtest and haoess. The error occurred when I did not have the outer loop that examines each table found. I was also working with a chopped-up HTML file that was missing the `<table>` tag, although I don't think it even got to that point. While installing the module, all tests that ran were successful; however, some tests were skipped because I did not have one module (HTML::Element, if I recall correctly) installed. Hope that helps track it down.
Re: resolving HTML::TableExtract error by socketdave (Curate) on Jul 13, 2005 at 20:42 UTC
Don't mean to offend, but did you 'use HTML::TableExtract'? (if you're not THAT much of a newb, I apologize in advance...)
Re^2: resolving HTML::TableExtract error by jaydon (Novice) on Jul 13, 2005 at 20:46 UTC
Yes I did :) apology accepted...
Re: resolving HTML::TableExtract error by haoess (Curate) on Jul 13, 2005 at 20:48 UTC
foreach $row ($te->rows) {

You'll have to walk through the parsed tables first:

foreach my $ts ( $te->table_states ) {
    foreach my $row ( $ts->rows ) {
        ...
    }
}

Please have a look at `perldoc HTML::TableExtract` and feel free to contact its author to provide better error messages for misuses like yours.

--Frank
Re^2: resolving HTML::TableExtract error by jaydon (Novice) on Jul 13, 2005 at 21:18 UTC
I kind of did. The Synopsis in that documentation was where I got that code from:

# Shorthand...top level rows() method assumes the first table found in
# the document if no arguments are supplied.
foreach $row ($te->rows) {
   print join(',', @$row), "\n";

I am probably misinterpreting what it says, but I took that to mean that I don't have to examine all matching tables with an (outer) foreach loop if I am only concerned with the first table found. Anyway, I took your advice and added the outer foreach loop, but my data file remains empty. Here is the amended code:

use HTML::TableExtract;

my $te = HTML::TableExtract->new( headers => [qr/Month\s*/,
                                              qr/First\s*/,
                                              qr/High\s*/,
                                              qr/Low\s*/,
                                              qr/Sett\s*/,
                                              qr/Chg\s*/,
                                              qr/Vol\s*/,
                                              qr/GOWAVE\*\s*/] );
$te->parse_file($sourcefile);

my $record;
open (DATFILE, ">> meg.dat") or die "Unable to open meg.dat: $!";
print DATFILE "Table:\n";
foreach my $ts ($te->table_states) {
   foreach my $row ($ts->rows) {
      $record = join(',', @$row);
      print $record . "\n";
      print DATFILE $record . "\n";
   }
}
close DATFILE;

And this is the HTML:

<tr align="center" valign="top">
  <td><strong>Month </strong></td>
  <td><strong>First </strong></td>
  <td><strong>High </strong></td>
  <td><strong>Low </strong></td>
  <td bgcolor="#f3f3f3"><strong>Sett </strong></td>
  <td bgcolor="#f3f3f3"><strong>Chg </strong></td>
  <td><strong>Vol</strong></td>
  <td><strong>GOWAVE</strong></td>
  <td width="1" style="border-bottom-style:none;"></td>
  <td><strong>Vol</strong></td>
  <td style="border-right:1px solid #C0C0C0;"><strong>Open Int</strong></td>
</tr>
Re^3: resolving HTML::TableExtract error by crashtest (Curate) on Jul 14, 2005 at 00:34 UTC
The HTML snippet you provided above is not conducive to testing your code. Besides not being enclosed in `<table>` tags, it only has one row (the header). Both (apparently) prevent the HTML from being parsed into a `table_state`. Once I fixed that, your code (with haoess's extra loop over the table states) started producing data.

One note of caution: according to the documentation, you should be passing regular expression strings to the constructor, not actual regular expressions. I.e., your constructor should look like:

my $te = HTML::TableExtract->new( headers => [ qw( Month\s* First\s* High\s* ... )] );

... although your constructor with the `qr//`'s was working as well.

I had no trouble using the `rows` method on the table extract object directly, as in your original post. That makes me wonder whether you grabbed an older version off CPAN. I'm guessing the shorthand `rows` method in the `HTML::TableExtract` class might have been added somewhere down the line. The version I have is 1.10.

Hope this helps...
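
For anyone trying to reproduce this, here is a minimal sketch of a self-contained test case along the lines described above: a complete `<table>` with a header row plus one data row, walked with the nested loop. The sample data is made up, `extract_rows` is a hypothetical helper rather than part of the module, and the headers are passed as plain strings. It uses `table_states` as in this thread; newer releases document the same loop with `tables`.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# extract_rows() is a hypothetical helper, not part of HTML::TableExtract.
# It returns one comma-joined string per data row of every matching table.
sub extract_rows {
    my ( $html, @headers ) = @_;
    my $te = HTML::TableExtract->new( headers => \@headers );
    $te->parse($html);
    $te->eof;

    my @lines;
    foreach my $ts ( $te->table_states ) {
        foreach my $row ( $ts->rows ) {
            # Empty cells may come back undefined, so substitute ''.
            push @lines, join( ',', map { defined $_ ? $_ : '' } @$row );
        }
    }
    return @lines;
}

# Made-up sample: the header row from the thread (shortened) plus one data row,
# wrapped in the <table> tags that the original snippet was missing.
my $html = <<'END_HTML';
<table>
  <tr>
    <td><strong>Month </strong></td>
    <td><strong>First </strong></td>
    <td><strong>High </strong></td>
  </tr>
  <tr><td>Aug</td><td>1.50</td><td>1.62</td></tr>
</table>
END_HTML

print "$_\n" for extract_rows( $html, 'Month', 'First', 'High' );
```

Dropping the `<table>` tags or the data row from `$html` is a quick way to see the empty-output behaviour discussed above.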
Re^4: resolving HTML::TableExtract error by jaydon (Novice) on Jul 14, 2005 at 15:43 UTC