OfficeLinebacker has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, esteemed monks!

I get this... esoteric? error: Can't bless non-reference value at C:/Perl64/site/lib/HTML/ElementTable.pm line 431. It comes from the following code:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::Mechanize;
use Readonly;
use HTML::TreeBuilder;
#use HTML::Element qw(Table);
#use HTML::TableExtract qw(tree);
use HTML::TableExtract;
use HTML::Encoding 'encoding_from_http_message';
use Encode;
use File::Slurp;

Readonly::Scalar my $url => 'http://www.emarketplace.state.pa.us/Search.aspx';

my $mech = WWW::Mechanize->new( agent => 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:6.0.2) Gecko/20100101 Firefox/6.0.2' );
$mech->get($url);

#There is only one form on the page and they start at 1 in WWW::Mechanize
my $form = $mech->form_number(1);

# 'wucSearch$btnSearch' is the NAME of the button we want to press; 'wucSearch_btnSearch' is the id
# 'wucSearchResults$ddlRows' is the NAME of the input item we want to set to 'ALL'
$mech->select('wucSearchResults$ddlRows','ALL');

my $response = $mech->click_button(name => 'wucSearch$btnSearch');
if ($response->is_success) {
    #print $response->decoded_content;  # or whatever
}
else {
    die $response->status_line;
}

my $HTML = $response->decoded_content;
#my $te = HTML::TableExtract->new(slice_columns=> 0, keep_html => 1);#, headers => ["Solicitation#"]);
my $te = HTML::TableExtract->new();
$te->parse($HTML);

I tried taking out the qw(tree) in the use HTML::TableExtract line.

I see one other node about a similar error: Net::Packet::Dump can't bless non-reference as 'IO::File' but I don't really see how this would apply to me.

Without further ado, how do I fix this? Thanks!


I like computer programming because it's like Legos for the mind.

Re: "Can't bless non-reference value" error with HTML::TableExtract
by Perlbotics (Archbishop) on Sep 16, 2011 at 22:56 UTC

    The error occurs when HTML::ElementTable::new_from_tree tries to bless an undefined value in $cell into HTML::ElementTable::DataElement, here:

    ...
    if ($grid_row->[$c]) {
        #orig: bless $cell, 'HTML::ElementTable::DataElement';
        bless $cell, 'HTML::ElementTable::DataElement' if defined $cell; #patch
        next;
    }
    ...
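
For anyone who wants to see the error in isolation, here is a minimal sketch in plain Perl with no HTML modules involved. The helper bless_if_ref and the class name Demo::Cell are made up for illustration and are not part of HTML::ElementTable; the sketch only shows that bless dies when handed an undefined value, and that a guard avoids it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (NOT part of HTML::ElementTable): bless only when
# the value is a reference, and report whether the bless happened.
sub bless_if_ref {
    my ($cell, $class) = @_;
    return 0 unless ref $cell;    # undef and plain scalars are skipped
    bless $cell, $class;
    return 1;
}

# Reproduce the error: bless dies when handed an undefined value.
my $cell;                         # undefined, like an empty grid slot
my $ok  = eval { bless $cell, 'Demo::Cell'; 1 };
my $err = $ok ? '' : $@;
print "unguarded: $err" unless $ok;

# The guarded version skips the undefined value instead of dying.
print "guarded (undef):   ", bless_if_ref($cell, 'Demo::Cell'), "\n";
print "guarded (hashref): ", bless_if_ref({},    'Demo::Cell'), "\n";
```

The guard here tests ref rather than defined, which is slightly stricter than the one-line patch above: it also skips defined values that are not references.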

    I am not sure what I am doing here (just enough knowledge to be dangerous ;-) ... but the following patch seems to work, at least if you are using the tree variant (the patch is not required with the non-tree variant of HTML::TableExtract). First, the patch shown above is applied dynamically; then the columns 'Type' and 'Agency' are extracted and printed. I also added simple caching, since the download is roughly 1 MB.
    BTW: Do you have permission to scrape this information?
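
A rough sketch of what such simple caching could look like, using only core modules. This is not the code from the reply: the helper name cached_content, the cache file name, and the stub standing in for the real download are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the contents of $cache_file if it exists;
# otherwise call $fetch (a code ref), store its result, and return it.
sub cached_content {
    my ($cache_file, $fetch) = @_;

    if (-e $cache_file) {
        open my $in, '<:encoding(UTF-8)', $cache_file
            or die "Cannot read $cache_file: $!";
        local $/;                 # slurp mode
        my $content = <$in>;
        close $in;
        return $content;
    }

    my $content = $fetch->();
    open my $out, '>:encoding(UTF-8)', $cache_file
        or die "Cannot write $cache_file: $!";
    print {$out} $content;
    close $out or die "Cannot close $cache_file: $!";
    return $content;
}

# Demo with a stub in place of the real (roughly 1 MB) download.
my $dir   = tempdir(CLEANUP => 1);
my $cache = "$dir/search_results.html";
my $calls = 0;
my $fetch = sub { $calls++; return "<table><tr><td>stub</td></tr></table>" };

my $first  = cached_content($cache, $fetch);   # runs $fetch, writes the cache
my $second = cached_content($cache, $fetch);   # served from the cache file
print "fetch ran $calls time(s)\n";
```

In the script from the question, the callback would wrap the click_button call and return $response->decoded_content, so the site is only hit when no cache file exists yet.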

    I am not sure whether this is a bug in the module and/or a consequence of an ill-formed HTML document. If it's the former, please notify the module author. Hope this gets you a step forward...

    Update: Bah, sometimes sleep helps... When using the tree variant, define and provide proper @headers:

    ...
    use HTML::TableExtract qw(tree);
    ...
    my @headers = ("Solicitation #", qw(Types Agency));
    my $te = HTML::TableExtract->new(
        slice_columns => 1,
        keep_html     => 0,
        # comment next line for "bless-error"
        headers       => \@headers
    );
    $te->parse($HTML);

    foreach my $ts ($te->tables) {
        print "======= Table (", join(',', $ts->coords), ") =======\n";
        print join("\t", @headers), "\n";
        foreach my $row ($ts->rows) {
            print join("\t", map { $_->as_trimmed_text() } @$row), "\n";
        }
    }

    Note: It's also Solicitation #, not Solicitation# (no match). You might still experiment with the method patch though. The original patch itself is rubbish, sorry.
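
To illustrate why the missing space matters, here is a small sketch. It assumes (from my reading of the HTML::TableExtract documentation, so please verify) that a string header is applied as an unanchored, case-insensitive pattern against the header cell text. The helper header_matches is hypothetical and only mimics that behaviour; it does not call the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the header test; ASSUMES string headers act
# as unanchored, case-insensitive patterns against the cell text.
sub header_matches {
    my ($header, $cell_text) = @_;
    return $cell_text =~ /$header/i ? 1 : 0;
}

my $cell_text = 'Solicitation #';   # header cell text as it appears on the page

for my $header ('Solicitation#', 'Solicitation #', 'solicitation') {
    printf "%-16s => %s\n", "'$header'",
        header_matches($header, $cell_text) ? 'match' : 'no match';
}
```

Under that assumption, the header without the space never matches the cell, so the table is not selected by headers at all.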
      Dang thoughtful response, guv'nor. Just getting back to this now. I'll play with it and report back.

      Edit: heh, works beautifully. As far as permission to scrape, if it's a public website and it's not an overloading operation, why not? Am I missing something?

      And good catch on the Solicitation# thing.

      Now to try to learn what you did instead of just blindly using it.

      Cheers!


      I like computer programming because it's like Legos for the mind.