"Can't bless non-reference value" error with HTML::TableExtract

OfficeLinebacker has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, esteemed monks!

I get this...esoteric? error: Can't bless non-reference value at C:/Perl64/site/lib/HTML/ElementTable.pm line 431. With the following code:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::Mechanize;
use Readonly;
use HTML::TreeBuilder;
#use HTML::Element qw(Table);
#use HTML::TableExtract qw(tree);
use HTML::TableExtract;
use HTML::Encoding 'encoding_from_http_message';
use Encode;
use File::Slurp;

Readonly::Scalar my $url => 'http://www.emarketplace.state.pa.us/Search.aspx';

my $mech = WWW::Mechanize->new( agent => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2' );

$mech->get($url);
# There is only one form on the page, and form numbers start at 1 in WWW::Mechanize
my $form = $mech->form_number(1);

# 'wucSearch$btnSearch' is the NAME of the button we want to press; 'wucSearch_btnSearch' is the id
# 'wucSearchResults$ddlRows' is the NAME of the input item we want to set to 'ALL'

$mech->select('wucSearchResults$ddlRows','ALL');

my $response = $mech->click_button(name => 'wucSearch$btnSearch');

 if ($response->is_success) {
     #print $response->decoded_content;  # or whatever
 }
 else {
     die $response->status_line;
 }

my $HTML = $response->decoded_content;

#my $te = HTML::TableExtract->new(slice_columns=> 0, keep_html => 1);#, headers => ["Solicitation#"]);
my $te = HTML::TableExtract->new();
$te->parse($HTML);

I tried taking out the qw(tree) in the use HTML::TableExtract line.

I see one other node about a similar error: Net::Packet::Dump can't bless non-reference as 'IO::File' but I don't really see how this would apply to me.

Without further ado, how do I fix this? Thanks!

I like computer programming because it's like Legos for the mind.


Replies are listed 'Best First'.
Re: "Can't bless non-reference value" error with HTML::TableExtract by Perlbotics (Archbishop) on Sep 16, 2011 at 22:56 UTC
The error occurs when `HTML::ElementTable::new_from_tree` tries to bless an undefined value in `$cell` into `HTML::ElementTable::DataElement`, here:

    ...
    if ($grid_row->[$c]) {
      #orig: bless $cell, 'HTML::ElementTable::DataElement';
      bless $cell, 'HTML::ElementTable::DataElement' if defined $cell; #patch
      next;
    }
    ...

I am not sure what I am doing here (just enough knowledge to be dangerous ;-) ... but the following patch seems to work - at least if you are using the tree variant (the patch is not required when used with the non-tree variant of HTML::ElementTable). First, the patch shown above is dynamically applied, then the columns 'Type' and 'Agency' are extracted and printed. Furthermore, simple caching was added, since the download is roughly 1 MB in size.

BTW: Do you have permission to scrape this information?

I am not sure whether this is a bug in the module and/or a consequence of an ill-formed HTML document. If it's the former, please notify the module author.

Hope this gets you a step forward...

Update: Bah, sometimes sleep helps... when using the tree variant, define and provide proper `@headers`:

    ...
    use HTML::TableExtract qw(tree);
    ...
    my @headers = ("Solicitation #", qw(Types Agency));
    my $te = HTML::TableExtract->new(
      slice_columns=> 1,
      keep_html => 0,
      # comment next line for "bless-error"
      headers => \@headers
    );
    $te->parse($HTML);
    foreach my $ts ($te->tables) {
      print "======= Table (", join(',', $ts->coords), ") =======\n";
      print join("\t", @headers), "\n";
      foreach my $row ($ts->rows) {
        print join("\t", map { $_->as_trimmed_text() } @$row), "\n";
      }
    }

Note: It's also Solicitation #, not Solicitation# (no match). You might still experiment with the method patch though. The original patch itself is rubbish, sorry.

Read more... (9 kB)
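To see the failure in isolation, here is a minimal sketch using only core Perl. The class name `My::DataElement` and the helpers `bless_unguarded` and `bless_guarded` are made up for this example and are not part of HTML::ElementTable; the guard only mirrors the `if defined $cell` idea from the patch above, under the assumption that some grid cells arrive undefined.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class name, used only for this illustration.
my $class = 'My::DataElement';

# Unguarded: blessing an undefined value dies at run time.
sub bless_unguarded {
    my ($cell) = @_;
    return bless $cell, $class;
}

# Guarded: leave undefined cells alone, in the spirit of the patch.
sub bless_guarded {
    my ($cell) = @_;
    return defined $cell ? bless($cell, $class) : undef;
}

# Three cells; the middle one is missing (undef).
my @cells = ( {}, undef, {} );

# Trap the fatal error so the script can report it and carry on.
my $ok  = eval { bless_unguarded($_) for @cells; 1 };
my $err = $ok ? '' : $@;
print "unguarded: ", ( $ok ? "ok\n" : $err );

# The guarded version blesses only the defined cells.
my @blessed = map { bless_guarded($_) } @cells;
my $count   = grep { defined $_ } @blessed;
print "guarded: $count of ", scalar(@cells), " cells blessed\n";
```

The unguarded loop stops at the undefined cell with the same "Can't bless non-reference value" message, while the guarded one simply skips it. Whether skipping is the right behaviour inside the module is a separate question; as noted in the update, supplying matching headers avoids the problem without touching the module.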
Re^2: "Can't bless non-reference value" error with HTML::TableExtract by OfficeLinebacker (Chaplain) on Sep 29, 2011 at 12:49 UTC
Dang thoughtful response, guv'nor. Just getting back to this now. I'll play with it and report back.

Edit: heh, works beautifully. As far as permission to scrape, if it's a public website and it's not an overloading operation, why not? Am I missing something? And good catch on the Solicitation# thing. Now to try to learn what you did instead of just blindly using it. Cheers!

I like computer programming because it's like Legos for the mind.