Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have been having trouble getting the HTML::Form library to work under Fedora Core 1 (perl 5.8.1-92 with perl-libwww-perl-5.65-6). For some reason, HTML::Form->parse doesn't return any forms for me.

For example, the following simple perl script:

    #!/usr/bin/perl
    use HTML::Form;
    use HTTP::Request;
    use LWP;

    my $ua = new LWP::UserAgent;
    my $uri="http://www.google.com";
    my $req = HTTP::Request->new(GET => $uri);
    my $res = $ua->request($req);
    print $res->content;

    my $form = HTML::Form->parse($res->content, $res->base());
    print "Form: ${form}\n";

Returns the source of the www.google.com page but fails to return the obvious embedded form.

If instead, I make the last line:

join(" ", $form->form);
I get the corresponding error message:

"Can't call method "form" on an undefined value at ./myscript.pl line 14"

which occurs presumably because no form is returned!
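A minimal sketch of a more defensive check, for illustration only: calling parse in list context and testing whether anything came back avoids the "undefined value" error and makes the failure explicit. The inline HTML and the example.com base URL are made-up stand-ins for the fetched page, not part of the original script.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical stand-in for the fetched page content.
my $html = <<'HTML';
<html><body>
<form name="f" action="/search" method="get">
<input type="hidden" name="hl" value="en">
<input type="text" name="q" value="">
</form>
</body></html>
HTML

# In list context, parse returns every form found in the document.
my @forms = HTML::Form->parse($html, "http://www.example.com/");

if (@forms) {
    # form() returns the current name/value pairs of the first form.
    print join(" ", $forms[0]->form), "\n";
} else {
    # An empty list means the parser found no <form> start tag.
    warn "No forms found in the document\n";
}
```

If the else branch fires on a page that clearly contains a form, the problem lies in the parser rather than in the calling script.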

I did not have any problems with this previously under RH8.0 with perl 5.8.0-88 and perl-libwww-perl-5.65-2.noarch.rpm

In fact, the above perl scripts still work under Fedora Core 1 when I 'chroot' to my old RH8.0 installation. So it seems that something is wrong with the Fedora Core 1 perl environment.

Any suggestions on what might be going on and how I might troubleshoot it further?

Replies are listed 'Best First'.
Re: HTML::Form->Parse (Perl) not working under Fedora Core 1
by Roger (Parson) on Dec 23, 2003 at 05:31 UTC
    How about using the Data::Dumper module to inspect the form object for you?

        use strict;
        use warnings;
        use HTML::Form;
        use Data::Dumper;

        my $html = do { local $/; <DATA> };
        my $form = HTML::Form->parse($html, "http://www.google.com.au");
        print Dumper($form);

        __DATA__
        # stick the HTML source here ...


      Thanks for the suggestion.

      However, as I expected, adding this code yields the following behavior:

      1. On the Fedora Core 1 system running perl 5.8.1:

       $VAR1 = undef;

      2. On my old RH8 system running perl 5.8.0:

      $VAR1 = bless( {
        'extra_attr' => { 'name' => 'f' },
        'enctype' => 'application/x-www-form-urlencoded',
        'action' => bless( do{\(my $o = 'http://www.google.com/search')}, 'URI::http' ),
        'method' => 'GET',
        'inputs' => [
          bless( { 'value' => 'en', 'name' => 'hl', 'type' => 'hidden' }, 'HTML::Form::TextInput' ),
          bless( { 'value' => 'ISO-8859-1', 'name' => 'ie', 'type' => 'hidden' }, 'HTML::Form::TextInput' ),
          bless( { 'maxlength' => '256', 'value' => '', 'name' => 'q', 'type' => 'text', 'size' => '55' }, 'HTML::Form::TextInput' ),
          bless( { 'value' => 'Google Search', 'name' => 'btnG', 'type' => 'submit' }, 'HTML::Form::SubmitInput' ),
          bless( { 'value' => 'I\'m Feeling Lucky', 'name' => 'btnI', 'type' => 'submit' }, 'HTML::Form::SubmitInput' )
        ]
      }, 'HTML::Form' );

      So again, for some reason the same code that parses the form properly under RH8.0/perl 5.8.0 fails to parse it under Fedora Core 1/perl 5.8.1. Note that on both systems I have no trouble fetching the html source -- the problem appears to be only with forms. It also happens on all websites; google is just an example.
        I did some brute-force troubleshooting using trusty "print" statements to trace the problem back from my script to the Form.pm module to the TokeParser.pm module.

        Specifically, HTML::Form->parse calls "get_tag" (in TokeParser.pm), which returns 'undef'. This occurs because, inside "get_tag", the "$self->get_token" call keeps returning tokens of type "T" until it runs out of tokens.

        When I do the same debugging on the working RH8.0/perl5.8.0 version, I get a mix of tokens of types "T", "S", and "E" so that 'undef' is not returned and things work ok.
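The token trace described above can be reproduced outside of Form.pm with a short script. This is only a sketch: the helper name token_type_counts and the inline HTML are made up for illustration. It feeds a document to HTML::TokeParser and counts how many tokens of each type get_token returns ("S" = start tag, "E" = end tag, "T" = text, "C" = comment, "D" = declaration, "PI" = processing instruction).

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Count the tokens of each type that HTML::TokeParser produces.
# (token_type_counts is a hypothetical helper, not part of the module.)
sub token_type_counts {
    my ($html) = @_;
    # Passing a scalar reference makes the parser read from the string.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser";
    my %count;
    while (my $token = $parser->get_token) {
        # The first element of each token is its type code.
        $count{ $token->[0] }++;
    }
    return \%count;
}

# Hypothetical sample document with one form and no text between tags.
my $html = '<html><body><form name="f" action="/search">'
         . '<input type="text" name="q"></form></body></html>';

my $counts = token_type_counts($html);
printf "%-2s %d\n", $_, $counts->{$_} for sort keys %$counts;
```

On a working installation the counts should include "S" and "E" entries. If the same script reports only "T" tokens, the fault is below HTML::Form, in the tokenizer or the parser underneath it.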

        My limited perl skills did not allow me to trace this back further, but hopefully this will shed some light on the problem...

        Any thoughts on what might be causing all of this?