
Re: Remote Login and Form Posting

by Corion (Patriarch)
on Jun 30, 2004 at 07:59 UTC

in reply to Remote Login and Form Posting

Whenever you automate a website, it is of utmost importance to trace what your "real" browser sends and then replicate that with your script. I can recommend the Mozilla LiveHTTPHeaders extension or HTTP::Recorder to sniff the traffic between your browser and the website.
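
On the script side, one way to see what goes over the wire is to hook into the LWP::UserAgent handler phases and dump every request and response. This is only a minimal sketch, assuming a libwww-perl recent enough to have add_handler; the URL in the commented-out line is a placeholder, not the site from the question.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Dump each outgoing request and each finished response, so they can be
# held against what the browser-side sniffer recorded.
# Returning nothing from request_send lets the request go out as usual.
$ua->add_handler( request_send  => sub { shift->dump; return } );
$ua->add_handler( response_done => sub { shift->dump; return } );

# Placeholder URL; replace it with the page you are automating.
# my $response = $ua->get('http://www.example.com/login');
```

With the handlers installed, every request made through $ua, including redirects it follows, shows up in the output.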

WWW::Mechanize is specifically designed to behave like a "normal" browser, so you are probably better off using WWW::Mechanize instead of LWP::UserAgent directly. All methods of LWP::UserAgent are available via WWW::Mechanize as well, so you don't lose anything, but you get automatic redirect and referrer handling.
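
As a rough sketch of what that could look like: because WWW::Mechanize is a subclass of LWP::UserAgent, a cookie jar can be passed to the constructor just as with plain LWP. The cookie file name, the URL and the login helper are made up for illustration; only the two field names are taken from the form data posted in this thread, and whether submit_form finds them depends on the actual login page.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use HTTP::Cookies;

# Placeholder values, not the real site.
my $cookie_file = 'lwp_cookies.dat';
my $login_url   = 'http://www.example.com/login';

my $mech = WWW::Mechanize->new(
    cookie_jar => HTTP::Cookies->new( file => $cookie_file, autosave => 1 ),
);

# Hypothetical helper: fetch the login page first, so that cookies and
# hidden fields come from the site itself, then fill in the visible
# fields and submit the form that contains them.
sub login {
    my ( $user, $pass ) = @_;
    $mech->get($login_url);
    $mech->submit_form(
        with_fields => {
            logonUsername => $user,
            logonPassword => $pass,
        },
    );
    return $mech->content;
}
```

Calling login('name', 'password') would then return the page the site sends back after the form is submitted.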

While you are developing your website automation, you will often want to compare what your browser sends with what your script sends, and make the two as identical as possible, so having a diff utility handy and turning up the LWP logging might be helpful too.
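
If no diff utility is at hand, a crude comparison can be done in Perl itself. The sketch below builds a form POST with HTTP::Request::Common (which also URL-encodes the field values) and lists the lines that appear on only one side. The URL, the field values, the captured text and the only_in_first helper are all placeholders for illustration; in practice the capture would be pasted from your sniffer.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# Placeholder URL and values; the field names are the ones from this thread.
my $req = POST 'http://www.example.com/login', [
    logonUsername => 'user',
    logonPassword => 'secret',
    Logon         => 'Login',
];

# The same request as recorded from the real browser (shortened placeholder).
my $browser_capture = <<'END';
POST http://www.example.com/login
Content-Type: application/x-www-form-urlencoded

logonUsername=user&logonPassword=secret&Logon=Login
END

# Hypothetical helper: return the non-empty lines of the first text
# that do not occur anywhere in the second text.
sub only_in_first {
    my ( $first, $second ) = @_;
    my %seen = map { $_ => 1 } grep { length($_) } split /\n/, $second;
    return grep { length($_) && !$seen{$_} } split /\n/, $first;
}

print "script only:  $_\n" for only_in_first( $req->as_string, $browser_capture );
print "browser only: $_\n" for only_in_first( $browser_capture, $req->as_string );
```

Every line printed is a difference worth explaining before moving on.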

Re^2: Remote Login and Form Posting
by powerhouse (Friar) on Jul 01, 2004 at 07:04 UTC
    I have looked at that module and I don't see how to use a cookie_jar with it to first log in to their website. I am reading through the module's documentation on CPAN.

    I have put a TON of time into this and don't want to start at ground zero, unless I have to. Therefore, how would I get what I have already written to work?

    I added all the form fields to the LWP version I made, but it does not seem to be doing it right. Is there a way to make this code here work:
        use HTTP::Request;
        use HTTP::Cookies;
        my $_cookie_file = "/home/first/cook/.lwp_cookies.dat";
        $_post_url = '';
        # Create new LWP::UserAgent object
        $browser = LWP::UserAgent->new;
        # Add a new Cookie Jar to the LWP Object
        $browser->cookie_jar(HTTP::Cookies->new("file" => $_cookie_file));
        # Spoof a very capable browser
        $browser->agent('Mozilla/5.0');
        # Allow https protocol and http....
        $browser->protocols_allowed( [ 'http', 'https'] );
        # Add the HTTP::Request for POST...
        $req = HTTP::Request->new("POST" => "$_post_url");
        # Add "content type" as form encoded
        $req->content_type("application/x-www-form-urlencoded");
        # Add the "Form fields" that are contained in the page...
        $req->content("IWPEProcessFlow.submitted.sequenceID=&logonUsername=myhiddenusername&logonPassword=myhiddenpass&Logon=Login&hMsgUserNotFound=User with this name was not found.&hMsgInvalidPassword=Logon not valid.&fromHomePage=default_B2B.htm");
        # Spoof the referrer to make the site think it's a person
        # from the login page...
        $req->referrer(" +B_UMLogon.tem");
        # Ok, now Perform Task
        $response = $browser->request($req);
        if($response->is_success) {
        # This is not working as it is posting in the 'ELSE' statement. It is
        # actually posting this in the "$response->content" field:

        START $response->content
        <HTML>
        <!-- IWLoadCurrentCustomerLocale.tem -
          -- Copyright (c) 1997 InterWorld Corporation -->
        <HEAD>
        <META HTTP-EQUIV=REFRESH CONTENT="1;URL=/E2B_UMLogOn.process">
        <TITLE>Refresh User Repository</TITLE>
        </HEAD>
        <BODY BGCOLOR="#ffffff">
        <!--<A HREF="/E2B_UMLogOn.process">
        <FONT SIZE="3" FACE="verdana, arial, helvetica">Please click here to continue...</FONT></A> -->
        </BODY>
        </HTML>
        END $response->content

    If not, I'll try to figure out how to use the WWW::Mechanize module.

    Thank you for any further assistance you can give.

      I have put a TON of time into this and don't want to start at ground zero, unless I have to. Therefore, how would I get what I have already written to work?

      In short, no. Putting "TONs of time" into automating a web site without looking at a working connection will never work, or will at least require inordinate amounts of "TONs of time".

      I recommend that you record a complete session with the actual browser, and that you don't fake a non-existent browser that sends the User-Agent header Mozilla/5.0. Please look at the traffic sent by LWP and compare it to the traffic sent by your browser. Pretending to be a browser starts with sending the same data as the browser.
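
      For the headers, that could look roughly like the sketch below. The User-Agent and Accept-Language values are placeholders standing in for whatever your own browser's recorded request contains; copy the real values from your capture.

      ```perl
      use strict;
      use warnings;
      use LWP::UserAgent;

      my $ua = LWP::UserAgent->new;

      # Placeholder values: take the real ones from the recorded browser session.
      $ua->agent('Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20040626 Firefox/0.9.1');
      $ua->default_header( 'Accept-Language' => 'en-us,en;q=0.5' );
      ```

      Both settings apply to every request made through $ua, so they only need to be set once.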

      I can only recommend that you try a different route and let the code you already have rest for a while. Try imitating a browser by mimicking exactly what it does, striving to send data identical to what the browser would send.

      HTTP::Recorder (by leira) and WWW::Mechanize::Shell (by me) are two modules that will try to write WWW::Mechanize scripts for you, and maybe they will already suffice, but I cannot stress enough the importance of comparing what your script sends against what your browser sends in a working session.
