Special_K has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use WWW::Mechanize to automate navigation of a webpage that requires me to click on at least one button that is tied to javascript code. I know that WWW::Mechanize does not support javascript but am trying to avoid using WWW::Mechanize::Firefox because I don't want to deal with the GUI.

Given that, I am trying to decipher the data that is transmitted when I navigate through the website using a browser to see if I can just manually do what the javascript is doing within my perl code.

When I click the button manually, here is the POST data:



cbd_ria:true
ANSWER:*******
LoginForm:ANSWER:*******
LoginForm:DEVICE:true
ANTI_CSRF_TOKEN:9ec2f31c-4008-4bce-b759-3b3aceda6f42
LoginForm:LoginForm


Here is the response header:



Content-language:en-US
Content-length:0
Date:Fri, 04 Jul 2014 04:12:25 GMT
Location:https://******************************
P3p:CP='NOI CURa ADMa DEVa TAIa OUR BUS IND UNI COM NAV INT'
Set-cookie:PMData=PMV6GDrDYODTPUcMqubLWUJOi%2BhcTJGSUaC%2B%2Fk0FN2ITW8Isa1Awx90hD%2B3wCtpjInRT5Gw%2Bye28N9Tr6uhoe8T9R30g%3D%3D;path=/;HttpOnly; Expires=Sat, 04 Jul 2015 04:12:24 GMT; Domain=**********.com; Secure
X-frame-options:SAMEORIGIN


When I manually fill out the boxes corresponding to LoginForm:ANSWER and LoginForm:DEVICE and then click the continue button, the website authenticates my responses and then takes me to the location specified in the "Location:" field of the response header, which is where I intend to go.

How would I replicate what the javascript is doing within perl? Do I just fill out all the fields in the POST data above, and then GET the Location: URL?

Filling out the LoginForm:ANSWER and LoginForm:DEVICE fields is easy enough, but what about the ANTI_CSRF_TOKEN? I have no idea how to generate it. Is it even required?

Also I did some searching and stumbled on a few perl javascript engines such as JavaScript::SpiderMonkey, WWW-Scripter-Plugin-JavaScript, and WWW-Mechanize-PhantomJS. All of these have an eval() function. The button I am trying to press has the following code tied to it:



<input type="button" value="Continue" tabindex="1" onkeypress="fd.button._keyPress(event, 'LoginForm:continue')" initialtabindex="1" class="btn" id="LoginForm:continueInput">


In this case, the button appears to take the following action:



fd.button._keyPress(event, 'LoginForm:continue')


If I use one of the javascript modules, is it literally as simple as calling eval on the function bound to the button once I have set the appropriate variables (I assume the JS function will handle the ANTI_CSRF_TOKEN?)

Replies are listed 'Best First'.
Re: WWW::Mechanize and buttons tied to javascript
by Anonymous Monk on Jul 04, 2014 at 06:51 UTC

    WWW::Mechanize::Firefox because I don't want to deal with the GUI.

    Use WWW::Mechanize::PhantomJS then: it comes without a GUI, supports javascript, and its prerequisites are simple to install from http://phantomjs.org/download.html

    ... here is the POST data:

    That doesn't look very much like POST data, not any regular kind of POST data, where did you get that from?

    to see if I can just manually do what the javascript is doing within my perl code

    Maybe you realize this -- but it's not going well -- take the easy way out :)

    How would I replicate what the javascript is doing within perl? Do I just fill out all the fields in the POST data above, and then GET the Location: URL?

    Well, you'd do it by figuring out and knowing what the javascript does and the way it does it, then replicate that request -- do you know javascript?

    You could also intuit from the request what to do, but this is a lot of guessing, and guessing is a poor painful strategy

    You could also read the docs for the website and use their official api ... guesswork eliminated :)

    What you've posted so far is nowhere near enough information to do any of these things :) but I wouldn't even try, I'd be lazy, I'd use phantomjs

    If I use one of the javascript modules, is it literally as simple as calling eval on the function bound to the button once I have set the appropriate variables (I assume the JS function will handle the ANTI_CSRF_TOKEN?)

    You wish :) in theory maybe... browsers are hard to write ... you could try it to see what happens :)

    Your best chances of success are phantomjs, IMHO naturally

    Best of luck

      That doesn't look very much like POST data, not any regular kind of POST data, where did you get that from?



      Maybe my terminology was incorrect. When I logged into the website via Chrome, that was the data listed under the "Form Data" section of the Network->Headers section of the Chrome Developer Tools.

      Well, you'd do it by figuring out and knowing what the javascript does and the way it does it, then replicate that request -- do you know javascript?



      I only know a little bit, but that was kind of why I was asking this question. How would I figure out what the website is doing? As I said in my post, it appears to be taking the form data I entered, verifying the responses, and GETing the location URL I posted above. The only part I really don't understand is the ANTI_CSRF_TOKEN.

      I will look into phantomjs.

        Maybe my terminology was incorrect. When I logged into the website via Chrome, that was the data listed under the "Form Data" section of the Network->Headers section of the Chrome Developer Tools.

        :) Well, your terminology appears correct ... form data usually comes delimited with "=" whereas headers are delimited with ":" ... the data you've shown has two:two:two ... that confused me a bit ... I don't have chrome :)
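To illustrate the "=" versus ":" point: the browser tools list each field as name:value, but the body actually sent is normally `application/x-www-form-urlencoded`, with "=" between name and value and "&" between pairs. Below is a minimal sketch of that encoding using a few of the fields from the question. The value `secret` is a placeholder, `form_escape` is a hypothetical helper written only for this example, and the sketch handles ASCII input only.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character outside the
# RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
# ASCII input only; a real script would normally use URI::Escape.
sub form_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Name/value pairs as the browser tools list them; 'secret' is a placeholder.
my @pairs = (
    [ 'cbd_ria',          'true' ],
    [ 'LoginForm:ANSWER', 'secret' ],
    [ 'LoginForm:DEVICE', 'true' ],
    [ 'ANTI_CSRF_TOKEN',  '9ec2f31c-4008-4bce-b759-3b3aceda6f42' ],
    [ 'LoginForm',        'LoginForm' ],
);

# Join name and value with "=", and the pairs with "&".
my $body = join '&',
    map { form_escape($_->[0]) . '=' . form_escape($_->[1]) } @pairs;

print "$body\n";
```

The colons inside the field names end up percent-encoded, which is why the raw request body looks different from the list shown in the browser tools.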

        I only know a little bit, but that was kind of why I was asking this question. How would I figure out what the website is doing?

        :) I already explained -- I'd do it by figuring out and knowing what the javascript does and the way it does it -- this means I'd read the javascript program -- try a few of the functions (firefox ctrl+shift+k or ctrl+shift+j ) -- the open source version of reverse engineering

        a lot of the time they just fetch this var, fetch that one, do some simple substitution/addition ... sometimes they md5 or sha256 some part and set a new variable ... sometimes it's considerably more pointless contortions (esp. difficult to discern when they use jsmin type compressors)
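If reading the page's script shows that it hashes something before submitting, the same step is easy to reproduce with core Perl modules. This is only a sketch: the rule "sha256 of the answer followed by a nonce" and the name `derive_field` are made up for illustration, and nothing in this thread says the site in question does this.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Digest::SHA qw(sha256_hex);

# Hypothetical rule, invented for this example: the page's script sets a
# new form variable to sha256(answer . nonce), hex-encoded.
sub derive_field {
    my ($answer, $nonce) = @_;
    return sha256_hex($answer . $nonce);
}

# Placeholder inputs; a real nonce would come from the served page.
my $derived = derive_field('secret', 'abc123');
print "sha256 variant: $derived\n";

# The md5 flavour of the same idea.
print "md5 variant:    ", md5_hex('secret' . 'abc123'), "\n";
```

The useful part is the workflow rather than the particular hash: find the function in the page source, note which inputs it reads, and mirror that computation before building the request.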

        The only part I really don't understand is the ANTI_CSRF_TOKEN.

        It's just another form variable, it's given to your browser to prevent replay attacks and session riding... CSRF ... https://www.owasp.org/index.php/Session_Management#Page_and_Form_Tokens , Cryptographic nonce, Plack::Middleware::CSRFBlock
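Since the token is handed to the browser rather than generated by it, a script usually does not need to compute it: it can read the value out of the served page and send it back. A minimal offline sketch with HTML::Form (the form parser that WWW::Mechanize builds on) follows. The markup is hypothetical: it assumes the token arrives as a hidden input in a form with id `LoginForm`, and `https://example.com/` and `/login` are placeholders, none of which is confirmed for the real site.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical login page; the real field names, form id and action
# may differ. The token is assumed to be served as a hidden input.
my $html = <<'HTML';
<form id="LoginForm" method="post" action="/login">
  <input type="text"   name="LoginForm:ANSWER" value="">
  <input type="hidden" name="LoginForm:DEVICE" value="true">
  <input type="hidden" name="ANTI_CSRF_TOKEN" value="9ec2f31c-4008-4bce-b759-3b3aceda6f42">
  <input type="button" value="Continue" id="LoginForm:continueInput">
</form>
HTML

# In scalar context parse() returns the first form in the document.
my $form = HTML::Form->parse($html, 'https://example.com/');

# The token is already part of the form; nothing has to be generated.
my $token = $form->value('ANTI_CSRF_TOKEN');
print "token from page: $token\n";

# Fill in the visible field (placeholder answer) and build the request.
# make_request only constructs an HTTP::Request; nothing is sent here.
$form->value('LoginForm:ANSWER', 'secret');
my $request = $form->make_request;

print $request->method, ' ', $request->uri, "\n";
print $request->content, "\n";
```

With WWW::Mechanize the same thing happens implicitly: after `get`, selecting the form and calling `field` and `submit` sends the hidden inputs along with whatever was filled in. If the real page adds or changes the token from javascript instead of serving it in the markup, this approach will not see it.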