in reply to Handling Javascript with LWP::UserAgent

The problem is simply that LWP::UserAgent knows nothing about JavaScript and does not execute it. You need to extract the interesting parts of the JavaScript yourself and react accordingly.
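
As a rough sketch of what extracting the interesting parts can look like, the snippet below pulls the target of a simple JavaScript redirect out of a page body. The sub name js_redirect_target and the sample HTML are made up for illustration, and the regex only covers assignments of a literal string to location; anything computed at runtime will not match.

```perl
use strict;
use warnings;

# Hypothetical helper: return the URL of the first simple JavaScript
# redirect found in $html, or undef if none is found.
sub js_redirect_target {
    my ($html) = @_;

    # Look only inside <script> ... </script> blocks.
    while ($html =~ m{<script\b[^>]*>(.*?)</script>}gis) {
        my $js = $1;

        # Matches e.g.  window.location = "...";  location.href = '...';
        if ($js =~ m{\b(?:(?:window|document)\.)?location(?:\.href)?\s*=\s*(["'])(.*?)\1}s) {
            return $2;
        }
    }
    return;
}

# Made-up sample page, standing in for $res->content from LWP::UserAgent.
my $html = <<'HTML';
<html><head>
<script type="text/javascript">
  window.location = "/search/results.html?page=1";
</script>
</head><body></body></html>
HTML

my $target = js_redirect_target($html);
print defined $target ? "redirect to: $target\n" : "no redirect found\n";
```

Once the target is known, it can be requested with LWP::UserAgent like any other URL.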

Re^2: Handling Javascript with LWP::UserAgent
by mrguy123 (Hermit) on Jul 09, 2006 at 15:37 UTC
    Hi. This is an example of a page I retrieved that uses JavaScript. The code is as follows:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use LWP::UserAgent;
        use HTTP::Request;

        {
            my $ua = LWP::UserAgent->new;
            my $search_address = "http://online.wsj.com/search/full.html?";

            # create the request object
            my $req = HTTP::Request->new('GET', $search_address);

            # send the request
            my $res = $ua->request($req);
            if (!$res->is_success) {
                warn "Warning: " . $res->message . "\n";
            }

            # headers followed by body ($response is declared only once)
            my $response = $res->headers_as_string();
            $response .= $res->content();
            print "$response\n";
        }

    If you run this code you should get a response that contains JavaScript. As you can see, the code is basically the same except for the URL.
      The JavaScript on the page can do many different things. In some cases it can be ignored; in others it cannot.

      In general, if you can use the page in your favorite browser with JavaScript turned off and lose no essential functionality, you can work with it just as easily through LWP::UserAgent.

Re^2: Handling Javascript with LWP::UserAgent
by mrguy123 (Hermit) on Jul 09, 2006 at 15:19 UTC
    I have been able to retrieve pages with Javascript in the past. What's different now?

      Since you do not tell us what those past pages looked like (in terms of Javascript), how could we possibly tell you what has changed? We can guess -- perhaps those pages you speak of did not use Javascript for navigational purposes. You could try the Javascript CPAN module. I used this successfully several years ago to decrypt an encrypted web page, but something tells me you might need to parse out the href locations yourself and feed those back to LWP.
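
      A minimal sketch of parsing out the locations by hand might look like the following. The sub name js_link_targets and the sample HTML are invented for illustration, and the patterns assume the page writes its targets as single-quoted literals inside javascript: hrefs or onclick handlers, which will not hold for every site.

```perl
use strict;
use warnings;

# Hypothetical helper: collect string literals that look like paths or
# URLs from javascript: hrefs and onclick handlers in $html.
sub js_link_targets {
    my ($html) = @_;
    my @targets;

    # Grab the value of each href="javascript:..." or onclick="..." attribute.
    while ($html =~ m{\b(?:href\s*=\s*"javascript:|onclick\s*=\s*")([^"]*)"}gi) {
        my $js = $1;

        # Inside the handler, keep single-quoted literals that start with
        # "/" or "http" (an assumption about how the page writes its links).
        push @targets, $js =~ m{'((?:/|https?://)[^']*)'}g;
    }
    return @targets;
}

# Made-up sample markup, standing in for $res->content.
my $html = <<'HTML';
<a href="javascript:openWin('/article/one.html')">One</a>
<a href="#" onclick="go('http://example.com/two.html'); return false;">Two</a>
<a href="/plain.html">Plain link, no JavaScript</a>
HTML

print "$_\n" for js_link_targets($html);

# With LWP, each target could then be resolved against the response's
# base URL and requested, for example:
#   my $abs  = URI->new_abs($target, $res->base);
#   my $page = $ua->get($abs);
```

      Relative targets need to be made absolute before they are fetched, which is what the commented-out URI->new_abs call is for.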

      jeffa

      L-LL-L--L-LL-L--L-LL-L--
      -R--R-RR-R--R-RR-R--R-RR
      B--B--B--B--B--B--B--B--
      H---H---H---H---H---H---
      (the triplet paradiddle with high-hat)
      
        Since the JavaScript module doesn't include a Browser Object Model, pages that rely on browser objects (window, document, history, etc.) don't work with it, so your script doesn't really demonstrate a solution using that module with LWP::UserAgent to handle pages with JavaScript.