Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to write a simple script to download the list of results available for December from a website. However, this is my first time trying to post to a form and I'm getting nowhere. The results I get look as if I had just called up the page and never hit the submit button. What am I doing wrong?
#!/usr/bin/perl
use strict;
use warnings;
use LWP;

my $browser = LWP::UserAgent->new;
my $url = 'http://www.gulfgreyhound.com/officialracingresults.asp';
my $response = $browser->post( $url, [ "subfolder" => "dec03", ] );

die "$url error: ", $response->status_line
    unless $response->is_success;
die "Weird content type at $url -- ", $response->content_type
    unless $response->content_type eq 'text/html';

print $response->content;

Re: posting to a form to get results
by Roger (Parson) on Dec 24, 2003 at 22:39 UTC
    I had a look at the HTML source of the racing results page. The following fragment defines the post action ...
    <form action=monthresultslist.asp method=post> ...
    As you can see here, you are posting to the wrong URL.
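    A minimal sketch of the fix with plain LWP, assuming the form's only required field is subfolder (as in the original script). The relative action is resolved against the page URL, and the request is built without being sent so it can be inspected first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;
use HTTP::Request::Common qw(POST);

# Page that contains the form, as given in the question.
my $page = 'http://www.gulfgreyhound.com/officialracingresults.asp';

# The form's action attribute is relative, so resolve it against the page URL.
my $action = URI->new_abs( 'monthresultslist.asp', $page );

# Build (but do not send) the POST request so it can be checked by eye.
my $request = POST( $action, [ subfolder => 'dec03' ] );
print $request->as_string;

# To actually send it (needs network access):
#   use LWP::UserAgent;
#   my $response = LWP::UserAgent->new->request($request);
#   print $response->decoded_content if $response->is_success;
```

    Printing the request first makes it easy to compare against what a real browser sends.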

    However, you could use WWW::Mechanize to do the browsing for you...
    #!/usr/bin/perl -w
    use strict;
    use WWW::Mechanize;

    my $url = 'http://www.gulfgreyhound.com/officialracingresults.asp';
    my $robot = WWW::Mechanize->new;

    $robot->get($url);
    $robot->form_number(1);
    $robot->set_fields('subfolder' => 'dec03');
    $robot->click();

    # Get the reply to my question
    my $html = $robot->content();
    print $html;
      You guys are great! Thanks for the help!
      Very nice example. I didn't actually look at the HTML source; I just used lynx, hit the submit button, and then '=' to see what page it went to and what the POST data was. Very convenient (for lynx friendly pages, anyway).
Re: posting to a form to get results
by b10m (Vicar) on Dec 24, 2003 at 23:06 UTC

    This reply is merely a suggestion to monks who hit this node by supersearching, for I believe it's not relevant to your specific example.

    Sometimes the result of posting with LWP looks as if nothing was posted at all. In that case, make sure you have not missed any hidden fields (some BOFHs even create "random" values for hidden fields). Also check whether the site uses cookie-stored session IDs (or other values stored in a cookie). Many sites do, and refuse any post that arrives without such a cookie. Annoying, but oh well: if you can do something with your browser, you can do the same with LWP :) I, myself, always quite enjoy working around these attempts to ban scripts :)
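    One way to avoid missing hidden fields is to let HTML::Form parse the page and build the request for you. This is only a sketch: the HTML fragment and its hidden "token" field are invented for illustration and are not taken from the site in question.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical page fragment; the hidden "token" field is made up.
my $html = <<'HTML';
<form action="monthresultslist.asp" method="post">
  <input type="hidden" name="token" value="abc123">
  <input type="text" name="subfolder" value="">
  <input type="submit" value="Go">
</form>
HTML

# The base URL lets HTML::Form resolve the relative action.
my $base = 'http://www.gulfgreyhound.com/officialracingresults.asp';
my ($form) = HTML::Form->parse( $html, $base );

# List every input, including the hidden ones a browser never shows.
for my $input ( $form->inputs ) {
    printf "%-8s %-10s %s\n",
        $input->type, $input->name // '(unnamed)', $input->value // '';
}

# Fill in the visible field; hidden values are carried along automatically.
$form->value( subfolder => 'dec03' );
my $request = $form->click;
print $request->as_string;
```

    The request returned by click can be handed to LWP::UserAgent's request method once it looks right.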

    If interested in the cookie stuff, see HTTP::Cookies.
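    A minimal sketch of giving LWP a cookie jar, so that a session cookie set by the form page is sent back with the post. The two-step fetch is shown only as comments because it needs network access, and the variable names in it are placeholders.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# An in-memory jar; cookies received in responses are stored here
# and sent back on later requests made through the same user agent.
my $jar     = HTTP::Cookies->new;
my $browser = LWP::UserAgent->new( cookie_jar => $jar );

# Typical use (needs network access; $form_page and $action are placeholders):
#   $browser->get($form_page);                      # picks up any session cookie
#   my $response = $browser->post( $action, [ subfolder => 'dec03' ] );
#   print $jar->as_string;                          # show what was stored
```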

    --
    b10m

      Good advice. Another general tip is to use a common User Agent string, such as

      Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
      because some sites disallow access based on the User Agent string as well. This looks like a decent list of Agents. Search Google for more.
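      A short sketch of setting that string with LWP::UserAgent, using the agent option of the constructor:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Identify as a common browser instead of the default "libwww-perl/x.y".
my $browser = LWP::UserAgent->new(
    agent => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
);

print $browser->agent, "\n";
```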

      Also, start the ball rolling with WWW::Mechanize, if for no other reason than that it handles cookies for you transparently. Just log in "as usual" and you have a session.
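      A rough sketch of what that can look like. The sub name, the URLs passed to it, and the "username" and "password" field names are all hypothetical; a real site will use its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# WWW::Mechanize creates a cookie jar by default, so session cookies
# set at login are sent automatically on later requests.
my $mech = WWW::Mechanize->new;

# Hypothetical helper: log in through a form, then fetch a protected page.
sub login_and_fetch {
    my ( $login_url, $user, $pass, $target_url ) = @_;

    $mech->get($login_url);

    # Pick the form that has these fields, fill them in, and submit it.
    $mech->submit_form(
        with_fields => { username => $user, password => $pass },
    );

    $mech->get($target_url);
    return $mech->content;
}

# Example call (needs network access and real URLs):
#   print login_and_fetch( $login_url, 'me', 'secret', $results_url );
```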

      jeffa

      L-LL-L--L-LL-L--L-LL-L--
      -R--R-RR-R--R-RR-R--R-RR
      B--B--B--B--B--B--B--B--
      H---H---H---H---H---H---
      (the triplet paradiddle with high-hat)
      
Re: posting to a form to get results
by ysth (Canon) on Dec 24, 2003 at 22:37 UTC
    It appears to use a different URL to process the form. Try monthresultslist.asp
Re: posting to a form to get results
by Anonymous Monk on Dec 24, 2003 at 22:53 UTC
    <shameless_plug>
    gulfgreyhound.com huh? If you are in the Houston area I invite you out to the Houston Perl Mongers meetings.
    Houston Perl Mongers Info.
    </shameless_plug>