in reply to Re: WWW::Mechanize and fooling server for javascript
in thread WWW::Mechanize and fooling server for javascript

I thought I replied to this last night but I don't see it. My apologies if I am missing something and this turns out to be a double post.

Thanks for the reply. I thought 'content' returned an HTML object if I didn't use the format. I'll give that a try.

I agree that it seems impossible not to get the whole page, but since this is my first time with Mech I don't really know everything it does yet.

The bottom line of this exercise is to get ALL the javascript source, including that which comes as a link rather than embedded. That was the problem I was having with LWP: the 'GET' only gave me the source for embedded javascript.

For a better description of what I am doing, please see the explanation in Using LWP to automate a login. I am trying to extract the assigned IP address from the ISP for a DSL line. To do that I need to log on to a D-Link EBR-2310 which will serve a status page containing that information. The trick is to authenticate to the router.

Re^3: WWW::Mechanize and fooling server for javascript
by marto (Cardinal) on Jul 31, 2007 at 12:32 UTC
    gw1500se,

    Some routers use JavaScript to build an HTML page dynamically with document.write. As you have already been told, neither LWP nor WWW::Mechanize supports JavaScript. See how to reboot adsl modem with perl? for a similar problem, and the list of WWW::Mechanize variants that do support JavaScript.

    I don't know what you mean by 'fooling Server' but depending on the JavaScript in question it is often possible to write some Perl that provides the same function as the JavaScript does. If all you are looking for is the IP address assigned by your ISP, perhaps you could simply use WWW::Mechanize to get http://www.whatismyip.com (or similar service), and parse the response.
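
A minimal sketch of the "ask an external service" idea above, assuming the service returns the address somewhere in plain text or HTML. The names extract_ipv4 and fetch_public_ip are hypothetical, and the layout of any particular service's page is not known here, so the code simply takes the first valid dotted quad it finds. A page that also prints a four-part version number would fool it.

```perl
use strict;
use warnings;

# Return the first valid dotted-quad IPv4 address in $text, or undef.
sub extract_ipv4 {
    my ($text) = @_;
    while ($text =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/g) {
        my $candidate = $1;
        # Reject matches such as 999.1.1.1 whose octets exceed 255.
        return $candidate unless grep { $_ > 255 } split /\./, $candidate;
    }
    return;
}

# Fetch an IP-echo page and pull the address out of the response.
# WWW::Mechanize is loaded lazily so extract_ipv4 works without it.
sub fetch_public_ip {
    my ($service_url) = @_;
    require WWW::Mechanize;
    my $mech = WWW::Mechanize->new(autocheck => 1);
    $mech->get($service_url);
    return extract_ipv4($mech->content);
}
```

Calling fetch_public_ip('http://www.whatismyip.com') would be the obvious use, though whether that page still serves the address in a form this regex can see is an assumption.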

    Martin
      What I meant was that I initially thought there was some kind of redirect that prevented the full page from loading if javascript was not enabled. Thus the server would serve a different page unless javascript was enabled. I was looking for a way to fool that mechanism into thinking javascript was enabled.

      I have since been convinced this is not possible, so my only alternative is to parse the javascript for the assignment I am looking for (data='some hash string'). The challenge is finding something that will let me access the javascript source. It seems that if the javascript is linked rather than embedded, LWP at least will not "GET" it. I am hoping Mech will when I try it without the format option.
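
For what it is worth, a single GET only ever returns the HTML document itself; a linked script has to be requested separately by its own URL, which is what a browser does behind the scenes. The sketch below shows one way that could look, assuming the scripts are referenced with ordinary script src attributes and that the assignment really has the form data='...'. The sub names are hypothetical, the regexes are a crude stand-in for a real HTML parser, and authentication to the router is not handled.

```perl
use strict;
use warnings;

# Return the src attribute of every <script src="..."> tag in $html.
# A regex is a rough approximation; HTML::Parser would be sturdier.
sub extract_script_srcs {
    my ($html) = @_;
    my @srcs;
    while ($html =~ /<script\b[^>]*?\bsrc\s*=\s*(["'])(.*?)\1/gis) {
        push @srcs, $2;
    }
    return @srcs;
}

# Return the quoted value assigned to the variable "data", or undef.
sub extract_data_assignment {
    my ($js) = @_;
    return $js =~ /\bdata\s*=\s*(["'])(.*?)\1/s ? $2 : undef;
}

# Fetch a page, then each external script it links to, and return the
# first data='...' value found. Needs WWW::Mechanize and URI installed.
sub find_data_on_page {
    my ($url) = @_;
    require WWW::Mechanize;
    require URI;

    my $mech = WWW::Mechanize->new(autocheck => 1);
    $mech->get($url);
    my $html = $mech->content;
    my $base = $mech->base;

    # Resolve relative src values now, before the next get() changes the base.
    my @script_urls = map { URI->new_abs($_, $base) } extract_script_srcs($html);

    # Inline scripts are part of $html, so check the page itself first.
    my $value = extract_data_assignment($html);
    return $value if defined $value;

    for my $script_url (@script_urls) {
        $mech->get($script_url);
        $value = extract_data_assignment($mech->content);
        return $value if defined $value;
    }
    return;
}
```

If the value is assembled at run time by document.write rather than sitting in the source as a literal, this will find nothing, and the JavaScript logic would have to be reproduced in Perl instead.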

        It's quite easy to see what requests your browser makes, for example with the Live HTTP Headers extension for Firefox. All you have to do then is to faithfully replicate the requests made by the browser with WWW::Mechanize. In one instance, I used HTTP::Request::FromTemplate to recreate HTTP requests from templates I created from sniffer logs. Other network analysis tools, like Wireshark or Sniffer::HTTP, could also be useful in determining the difference between what your browser sends and what your script sends.
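
As a rough illustration of replicating a browser's request, the sketch below takes a header dump copied out of such a capture tool and applies it to a WWW::Mechanize object. It is not the HTTP::Request::FromTemplate approach mentioned above, just a simpler stand-in. The sub names and the choice of which headers to skip are assumptions; Host, Content-Length and Connection are left to LWP, and Accept-Encoding is dropped so the response is not requested compressed.

```perl
use strict;
use warnings;

# Headers not copied from the capture (see the note above for why).
my %SKIP = map { lc($_) => 1 } qw(Host Content-Length Connection Accept-Encoding);

# Turn a captured header dump into a hash of header name => value.
sub parse_captured_headers {
    my ($capture) = @_;
    my %headers;
    for my $line (split /\r?\n/, $capture) {
        # The request line ("GET /path HTTP/1.1") has no "Name:" prefix.
        next unless $line =~ /^([A-Za-z0-9-]+):\s*(.*?)\s*$/;
        next if $SKIP{lc $1};
        $headers{$1} = $2;
    }
    return %headers;
}

# Build a WWW::Mechanize object that sends the captured headers.
sub mech_from_capture {
    my ($capture) = @_;
    require WWW::Mechanize;
    my $mech    = WWW::Mechanize->new(autocheck => 1);
    my %headers = parse_captured_headers($capture);

    # agent() sets User-Agent; the remaining headers go through add_header().
    $mech->agent(delete $headers{'User-Agent'}) if exists $headers{'User-Agent'};
    $mech->add_header(%headers) if %headers;
    return $mech;
}
```

Comparing the script's own traffic against the browser capture afterwards, with the same sniffer, shows which differences remain.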