in reply to Re^6: Automate WebLogin
in thread Automate WebLogin

I'm not sure what part of my advice you have problems with. I try to make the steps clear, so please tell me what step I didn't explain thoroughly enough:

Learn what your browser sends, then send that from Perl.

This is meant to tell you to investigate what data your browser sends to the remote webserver when you click a button. The intention behind it is that the remote webserver cannot know whether there is a browser or a Perl script on the other side, so as long as your Perl script sends the same data as a browser, the server will never notice the difference.
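As a minimal sketch of what that replication can look like in Perl, using LWP::UserAgent (the URL and header values here are placeholders, not taken from any real site — substitute whatever your own capture shows):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# Pretend to be the same browser the server already sees.
# URL and header values are placeholders for illustration.
my $ua = LWP::UserAgent->new;
$ua->agent('Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1');

my $req = HTTP::Request->new(GET => 'https://intra.example.com/login.cfm');
$req->header('Accept'          => 'text/html,application/xhtml+xml');
$req->header('Accept-Language' => 'en-us,en;q=0.5');

# my $res = $ua->request($req);   # uncomment to actually send it
print $req->as_string;
```

The print at the end just shows the request as it would go over the wire, which is handy for comparing against what the browser sent.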

For example, using the Live HTTP Headers extension.

This sentence is to show you a tool that can do the above.

Or learn Javascript and how it interacts with the HTML DOM, and what the click for a submit button does.

This sentence is intended to show you the other, more static, approach to scraping. Read (and understand) the Javascript and what it does, and directly replicate that from Perl.
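A sketch of that direct replication in Perl, assuming (hypothetically — the real names must come from the page's HTML and Javascript) that the click handler ends up POSTing a form with a username, a password, and a hidden field:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# If reading the Javascript shows that the submit button's click
# handler just fills in some fields and submits the form, you can
# POST the same fields directly. Field names and URL here are
# made up - take the real ones from the page source.
my $ua  = LWP::UserAgent->new;
my $req = POST 'https://intra.example.com/cmpm/login.cfm',
    [
        username => 'someuser',
        password => 'secret',
        dparm    => 'main',    # hidden field the Javascript sets
    ];

# my $res = $ua->request($req);   # uncomment to actually send it
print $req->as_string;
```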

Or just modify the code to find it out.

This sentence is to show you a variation on the more static approach. By modifying the Javascript code, you may also be able to find out what it does and what purpose it serves.

Re^8: Automate WebLogin
by libvenus (Sexton) on Jan 26, 2011 at 15:46 UTC

    Hi Corion - Sorry for the confusion, but I was not able to understand the Live HTTP Headers articles, probably because of the kind of experience I have. I didn't give the second one a try (learning Javascript). I tried the last one and modified the Javascript response I got, but got stuck with the click. I would like to follow up more on this.

      Live HTTP Headers is a plug-in for Firefox, a widely used web browser. It allows you to see what your browser sends, so that you can replicate that from Perl.

      If you are automating websites that use Javascript, some amount of Javascript knowledge is inevitable. I recommend you try to at least vaguely understand the Javascript that the pages use.

      None of the solutions can shield you from understanding what goes on between the web browser and the webserver, so I recommend you start familiarizing yourself with that.

        Hi Corion - I was able to install the Live HTTP Headers Firefox plugin and capture the data being exchanged between the webserver and the browser.

        Here are the last request and response headers before finally logging into the application:

        https://intra.xyz.com/cmpm/login.cfm?dparm=main

        GET /cmpm/login.cfm?dparm=main HTTP/1.1
        Host: intra.xyz.com
        User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
        Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
        Accept-Language: en-us,en;q=0.5
        Accept-Encoding: gzip, deflate
        Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
        Connection: keep-alive
        Referer: https://www.e-access.xyz.com/pagLogin/?retURL=hxtps%3A%2F%2Fintra%2Exyz%2Ecom%2Fcmpm%2Flogin%2Ecfm%3Fdparm%3Dmain&sysName=EBIZ
        Cookie: CFID=73908; CFTOKEN=fe84f551934ee35f-7BA356FE-1517-3A9B-94017F2054750B94; xyzESSec=0080Vfb7DN-d75F8qhbStTSlNXPp8VDwgQ8swtr41Im90s3lHiRb6FC4BzZFGGqHhuL5gJ5XgAk8fYffffiHzHsgQx-7KFGEdfSV2LopC20rbI_l795Xs8Wk36Ce8ldkmSMkjyyHgDnh9cqEP6RXa6AuoEUEaZT6pzpiadmu6b5_pApNkoHavdnSPgS8gLMT2gF_8AChkQj1GT7Kxj78sEI-G5ZXiX819ryZ0wd-W0Cp6pdpHbfYqPUPhkE9fQ1L9RjF; xyzESHr=USER%7cNAME%7cUSERID%40ap%2exyz%2ecom%7c%2b1+816+8236559%7c%7cjs7856%7c%7cUSERID%2cRTRRDYJ%2cT24B945%2c8620265%7cYNNNNNNNNNNNNNYNYYNNNNNN%7cUSER%7cEY1269A00%7c; xyzESg2=006cVfb7DN-d75F8qhbStTSlNXPp8VDwgQ8swtr41Im90s0E6WCok9TyV4ePylbGdQXhm5O-RU7O7uZEAC5GGTPAxNSUHa5wvFW_irLNTc4v4PM.zmfWPhfqyBLeXQMbssRNdlgYXv17VWw0cOpxRCuaSHQYU04zTnkN20aFNHSdm4hPi4F6l2k1WENRmDaDszWa6tBx_07mAC3kzuvIKl1hD8_rxJofmk6_SUgEox5_he5j

        HTTP/1.1 200 OK
        Server: Sun-Java-System-Web-Server/7.0
        Date: Thu, 06 Oct 2011 12:18:10 GMT
        Content-Type: text/html; charset=UTF-8
        Transfer-Encoding: chunked
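Once you have such a capture, replicating it from Perl is mostly a matter of copying the headers across. A sketch of that, using LWP::UserAgent with an HTTP::Cookies jar (the cookie values are shortened stand-ins; normally you would let the jar collect CFID, CFTOKEN etc. automatically by performing the earlier requests of the login sequence rather than setting them by hand):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# Shortened stand-ins for the captured cookie values; in a real
# script, earlier requests in the login sequence fill the jar.
my $jar = HTTP::Cookies->new;
$jar->set_cookie(0, 'CFID',    '73908',      '/', 'intra.xyz.com', 443, 0, 1, 3600, 0);
$jar->set_cookie(0, 'CFTOKEN', 'fe84f55...', '/', 'intra.xyz.com', 443, 0, 1, 3600, 0);

# Send the same identifying headers the capture shows.
my $ua = LWP::UserAgent->new(cookie_jar => $jar);
$ua->agent('Mozilla/5.0 (Windows NT 5.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1');
$ua->default_header('Referer' => 'https://www.e-access.xyz.com/pagLogin/');

# my $res = $ua->get('https://intra.xyz.com/cmpm/login.cfm?dparm=main');
print "cookie jar prepared\n";
```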

      I find Firebug helpful when debugging JavaScript. It's not a substitute for learning JavaScript, but it can help pinpoint problems.