initself has asked for the wisdom of the Perl Monks concerning the following question:

The following is an attempt to use LWP to populate a form. I want to fill out the form, press submit, and see the result.

Update: Turns out I was not properly submitting the form with HTML::Form. See Corion's reply for details.

When I submit this form, the result I get back in my UserAgent is not what I expect. Instead of a page showing the results based on my form fields, I get the default page again. Because there is JavaScript embedded in the page I am browsing with my script, which HTML::Form cannot interact with, I tried emulating the JavaScript in my code.

Any warnings you see if you try to run the code are harmless per the HTML::Form docs. Naturally a hidden field set to read-only will give a warning when it is populated.

Can anyone see how I need to set up this form for the server to accept my post? Which parts of the code are the most likely to be causing problems?

Here's the code:

#!/usr/bin/perl
# browse.pl

use strict;
use warnings;
use CGI ':standard';
use LWP::UserAgent;
use HTML::Form;
use Data::Dumper;

my $browser = LWP::UserAgent->new;
my $browse_url = 'http://browseusers.myspace.com/Browse/Browse.aspx';
my $response = $browser->get($browse_url);
my @forms = HTML::Form->parse($response);

# Pull ACTION out of JavaScript function, replace in FORM element
my $content = $response->content;
$content =~ m{document\.frmBrowse\.action = "(.*?)"};
my $action_url = "http://browseusers.myspace.com/Browse/" . "$1";
$forms[1]->action($action_url);
my $action = $forms[1]->action;

# Get Form Elements
my $zipRadius = $forms[1]->find_input("zipRadius", "option");
my $zipCode = $forms[1]->find_input("zipCode", "text");
my $Scope = $forms[1]->find_input("Scope", "radio");

# Get Hidden Elements
my $__EVENTTARGET = $forms[1]->find_input("__EVENTTARGET");
my $Page = $forms[1]->find_input("Page");

# Assign Values
$zipRadius->value("5");
$zipCode->value("92630");
#$Scope->value("scopeMyFriends");

#Populate Hidden Elements (WARNINGS OK)
$__EVENTTARGET->value("update");
$Page->value("1");

# Update Form
$forms[1]->click("update");

# Get Response from Server
$content = $response->content;
print $content;

# Dump Form (For Testing Only)
#print $forms[1]->dump;

Re: HTML::Form Submit Issue
by Corion (Patriarch) on Dec 09, 2005 at 07:36 UTC

    Update: I looked closer at your code and the first issue is that you're never sending the filled out form back to the server:

    # Update Form
    $forms[1]->click("update");

    Use something like the following code, as suggested by the HTML::Form documentation:

    # $ua is your LWP::UserAgent object ($browser in the script above)
    my $filled_out_request = $forms[1]->click;
    print $filled_out_request->as_string; # for debugging, see below
    $response = $ua->request($filled_out_request);
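
    To see the difference between building the request and sending it, here is a minimal offline sketch. The HTML snippet, the example.com base URL and the single zipCode field are made-up stand-ins for the real page, so treat it as an illustration rather than a drop-in fix:

    ```perl
    use strict;
    use warnings;
    use HTML::Form;

    # Hypothetical stand-in for the real page; URL and fields are made up.
    my $html = q{
    <form name="frmBrowse" method="post" action="/Browse/Browse.aspx">
      <input type="text" name="zipCode" value="">
      <input type="submit" name="update" value="Update">
    </form>
    };

    # In scalar context parse() returns the first form found in the document.
    my $form = HTML::Form->parse($html, 'http://www.example.com/');
    $form->value(zipCode => '92630');

    # click() only builds the request object; nothing goes over the network.
    my $request = $form->click('update');

    print ref($request), "\n";
    print $request->method, ' ', $request->uri, "\n";
    print $request->content, "\n";
    ```

    Until that object is handed to the user agent's request method, the server never sees the filled-out form.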

    Whenever automating access to a webpage, it is most important to replicate what the browser is sending. There are many tools nowadays to do that:

    • HTTP::Recorder - it is a proxy that sits between your browser and the web and it will create a WWW::Mechanize script that replicates all your actions.
    • For Mozilla there is the Live HTTP Headers extension, which displays all data that is sent in a log against which you can compare your program's output
    • My own module, Sniffer::HTTP, which also can display all HTTP traffic as it gets sent and received.

    With any of these tools, the process is always to

    1. Get the original data sent by the browser
    2. Write a program to replicate it
    3. Compare the data sent by your program against the original data
    4. Repeat
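
    Step 3 can be partly automated. The following is a rough sketch that compares two URL-encoded POST bodies field by field; parse_form_body and diff_form_bodies are hypothetical helper names, and the sample bodies are invented for illustration, not captured from the real site:

    ```perl
    use strict;
    use warnings;

    # Decode an application/x-www-form-urlencoded body into field => value pairs.
    sub parse_form_body {
        my ($body) = @_;
        my %fields;
        for my $pair (split /&/, $body) {
            my ($name, $value) = split /=/, $pair, 2;
            $value = '' unless defined $value;
            for ($name, $value) {
                tr/+/ /;
                s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
            }
            $fields{$name} = $value;
        }
        return %fields;
    }

    # List fields that are missing, different, or extra in the program's body.
    sub diff_form_bodies {
        my ($browser_body, $program_body) = @_;
        my %want = parse_form_body($browser_body);
        my %got  = parse_form_body($program_body);
        my @report;
        for my $name (sort keys %want) {
            if (!exists $got{$name}) {
                push @report, "missing: $name";
            }
            elsif ($got{$name} ne $want{$name}) {
                push @report,
                    "differs: $name (browser '$want{$name}', program '$got{$name}')";
            }
        }
        for my $name (sort keys %got) {
            push @report, "extra: $name" unless exists $want{$name};
        }
        return @report;
    }

    # Made-up sample bodies, not taken from a real capture.
    my $browser_body = '__EVENTTARGET=update&Page=1&zipCode=92630&zipRadius=5';
    my $program_body = '__EVENTTARGET=&Page=1&zipCode=92630';

    print "$_\n" for diff_form_bodies($browser_body, $program_body);
    ```

    This simple version keeps only the last value of a repeated field and ignores headers and cookies, which also need comparing.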

    Of course, I, as the author, think that my tools (Sniffer::HTTP and HTTP::Request::FromTemplate) are superior, but I use the other tools as well.

      Thank you so much! I took the Live HTTP Headers from Firefox and compared them with what I was sending with:

      my $filled_out_request = $forms[1]->click;
      print $filled_out_request->as_string;

      I found that what I was sending was correct. The trick was a) making the actual request with the UserAgent (which 'click' alone does not do) and b) making sure cookies were set up properly.
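
      For anyone who finds this later, both pieces together look roughly like the sketch below. fetch_and_submit is a hypothetical helper name, and it assumes the form you want is the first one on the page (in my script it was the second, $forms[1]), so adjust as needed:

      ```perl
      use strict;
      use warnings;
      use LWP::UserAgent;
      use HTML::Form;

      # An empty hash ref asks LWP for an in-memory HTTP::Cookies jar, so
      # cookies set by the first GET are sent back with the later POST.
      my $ua = LWP::UserAgent->new(cookie_jar => {});

      # Hypothetical helper: fetch a page, fill in its first form, submit it.
      sub fetch_and_submit {
          my ($ua, $url, %values) = @_;

          my $page = $ua->get($url);
          die 'GET failed: ' . $page->status_line unless $page->is_success;

          my $form = HTML::Form->parse($page)
              or die "No form found at $url";
          $form->value($_ => $values{$_}) for keys %values;

          # click() builds the request; request() actually sends it.
          return $ua->request($form->click);
      }
      ```

      The helper is only defined here, not called, because running it needs a live site to talk to.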

      Hats off to Corion!