redss has asked for the wisdom of the Perl Monks concerning the following question:

I want a script to retrieve the html response from the login page from a website, and I want to populate the username & password form inputs, and submit the login form and retrieve the response from that submission. Cookies not required.

The catch is: the response from initially retrieving the login page contains hidden form inputs that are necessary for the login to succeed.

Can I do that using LWP::UserAgent (or another module)?

Replies are listed 'Best First'.
Re: how to submit html form?
by dorko (Prior) on Nov 27, 2005 at 15:06 UTC
    You could do it with LWP::UserAgent, but you'll want to look at WWW::Mechanize. From the docs for WWW::Mechanize:
    Each fetched page is parsed and its links and forms are extracted. A link or a form can be selected, form fields can be filled and the next page can be fetched.
    And it sounds like that's exactly what you want to do.

    Cheers,

    Brent

    -- Yeah, I'm a Delt.
Re: how to submit html form?
by davido (Cardinal) on Nov 27, 2005 at 17:35 UTC

    It's pretty simple with WWW::Mechanize. That module offers a suite of form methods, and for submitting forms and clicking buttons there are the submit() and click() methods (among others). Reading hidden fields is not a problem. Hidden fields are only hidden by browsers that choose not to render them (which is basically all browsers), but there's no reason your script shouldn't have access to them. They're included in the HTML as you get it from the server.
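
    To make the point about hidden fields concrete, here is a minimal sketch using HTML::Form (the form parser that WWW::Mechanize relies on). The HTML snippet, the field names (token, username, password), the values, and the example.com URL are all made up for illustration; no network access is involved.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical login page, as a script would receive it from the server.
my $html = join "\n",
    '<form action="/login" method="post">',
    '  <input type="hidden" name="token" value="abc123">',
    '  <input type="text" name="username">',
    '  <input type="password" name="password">',
    '  <input type="submit" value="Log in">',
    '</form>';

# In scalar context, parse() returns the first form in the document.
my $form = HTML::Form->parse( $html, 'http://example.com/' );

# The hidden field is just another input as far as the script is concerned.
print "hidden token: ", $form->value('token'), "\n";

# Fill in the visible fields; the hidden one keeps its server-supplied value.
$form->value( username => 'alice' );
$form->value( password => 'secret' );

# click() builds the HTTP::Request a browser would send, without sending it.
my $request = $form->click;
print $request->method, " ", $request->uri, "\n";
print $request->content, "\n";
```

    The request built by click() carries the hidden token along with the fields you filled in, which is what the original question needs.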


    Dave

      I hear a lot about WWW::Mechanize, but I don't quite understand the point of what it allows you to do. I've read the docs, but I'm not sure of its real-life application. Can someone give me a quick synopsis? Thanks.


      —Brad
      "The important work of moving the world forward does not wait to be done by perfect men." George Eliot
        Well, I had a task at work to get the mileage from two of our offices to a bunch of cities to see which office was closer. Using WWW::Mechanize I went to one website to grab the zip codes of all the cities. I did have to write a wrapper script to sleep for a bit, as the site only allowed me to look up 6 zip codes at a time.

        Next I took that list, plugged it into MapQuest, and scraped for the distances. Needless to say, without WWW::Mechanize it would have taken me hours to do this by hand, while it took mere minutes to have the computer do it.

        There is a great joy in writing something automated, going for a quick break, and coming back to find all your work done. :)

        Useless trivia: In the 2004 Las Vegas phone book there are approximately 28 pages of ads for massage, but almost 200 for lawyers.

        WWW::Mechanize is a subclass of LWP::UserAgent, so it comes with all the functionality of UserAgent, plus its own additional functionality. What it is good for is acting like a web browser, while providing easy methods of automating many things that a web browser could do. This includes following links, handling forms, submitting forms, and so on.
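
        A quick way to see the subclass relationship for yourself is the sketch below. The agent string and timeout value are arbitrary choices for illustration; both are ordinary LWP::UserAgent constructor options that WWW::Mechanize accepts and passes through.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# LWP::UserAgent options (agent, timeout) work in the Mechanize constructor.
my $mech = WWW::Mechanize->new(
    agent   => 'example-script/0.1',   # arbitrary example value
    timeout => 10,                     # seconds, arbitrary example value
);

# The object is a WWW::Mechanize and also an LWP::UserAgent.
print "isa LWP::UserAgent\n" if $mech->isa('LWP::UserAgent');

# Inherited accessors are available alongside the Mechanize-specific methods.
print "agent:   ", $mech->agent,   "\n";
print "timeout: ", $mech->timeout, "\n";
```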

        The WWW::Mechanize documentation has some examples, but to me it made more sense once I dove into starting to use it. Think of a simple task you would like to automate that involves WWW screen scraping. Then sit down with the WWW::Mechanize docs, and figure out how you're going to accomplish this task using WWW::Mechanize (keeping in mind that it's a proper subclass of LWP::UserAgent). Fiddle and tweak, and eventually you'll figure out why it's useful.


        Dave

Re: how to submit html form?
by gu (Beadle) on Nov 27, 2005 at 19:26 UTC
    It's as easy as :
    use WWW::Mechanize ;

    my $m = WWW::Mechanize->new ;
    my %conf = ( foo => 'bar' ) ;   # field name => value pairs to fill in

    $m->get("http://your.url/") ;
    die $m->res->status_line unless $m->success ;

    # Choose form number
    $m->form_number(1) ;

    # Fill
    $m->set_fields( %conf ) ;

    # Submit
    $m->submit ;
    die $m->res->status_line unless $m->success ;

    # If the form sends you somewhere, you can catch it :
    my $url = $m->response->request->uri->as_string ;
    Then if you want to process HTML data from a page, I recommend you use HTML::TreeBuilder, for example:
    use HTML::TreeBuilder ;

    $m->get($url) ;
    die $m->res->status_line unless $m->success ;

    my $tree = HTML::TreeBuilder->new_from_content( $m->content ) ;
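
    Once you have the tree, pulling data out of it might look like the sketch below. It parses a made-up HTML string instead of $m->content so that it runs without a network connection; the page content and link targets are invented for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical page content standing in for $m->content.
my $html = join "\n",
    '<html><head><title>Welcome</title></head>',
    '<body>',
    '  <a href="/inbox">Inbox</a>',
    '  <a href="/logout">Log out</a>',
    '</body></html>';

my $tree = HTML::TreeBuilder->new_from_content($html);

# look_down() in scalar context returns the first matching element.
my $title_text = $tree->look_down( _tag => 'title' )->as_text;
print "title: $title_text\n";

# In list context it returns every match; collect the href of each link.
my @hrefs = map { $_->attr('href') } $tree->look_down( _tag => 'a' );
print "link: $_\n" for @hrefs;

# Free the tree explicitly when done.
$tree->delete;
```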

    Gu
Re: how to submit html form?
by planetscape (Chancellor) on Nov 27, 2005 at 23:25 UTC