How to download content generated by Javascript

lihao has asked for the wisdom of the Perl Monks concerning the following question:

Hi, folks:

I am trying to parse some web pages whose content is generated by a JavaScript function. For example, the HTML source looks something like the following (just a sample, with jQuery as the JavaScript library):

  <div id="content"></div>
  <script type="text/javascript">
    var value = from_javascript_function(...);
    $("div#content").html(value);
  </script>

How can I grab the content that the browser displays in the <div> with id="content"? Are there any Perl modules or other tools for this? Many thanks for your suggestions.

Regards,
lihao


Replies are listed 'Best First'.
Re: How to download content generated by Javascript by runrig (Abbot) on Mar 19, 2008 at 22:01 UTC
You can follow the advice already given and use something that handles JavaScript. Alternatively, install something like the Live HTTP Headers plugin for Firefox, examine for yourself what actually gets sent in the HTTP requests, and then use something simple like LWP or WWW::Mechanize to fetch the content directly. If you can get it working this way, it will run faster than the JavaScript-enabled plugin methods, though those might be faster to develop.
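
To make the direct-request idea concrete, here is a minimal sketch. Everything specific in it is an assumption: the URL is whatever you pass on the command line, and it supposes the page's JavaScript pulls its data from an endpoint that returns JSON with an "html" field, which you would have to confirm by watching the real requests. It uses the core modules HTTP::Tiny and JSON::PP instead of LWP only so that it runs without installing anything; LWP::UserAgent or WWW::Mechanize would fill the same role.

```perl
use strict;
use warnings;
use HTTP::Tiny;            # core module, stand-in for LWP here
use JSON::PP qw(decode_json);

# Pull the markup out of the response body.
# ASSUMPTION: the endpoint returns JSON shaped like {"html": "..."}.
sub extract_content {
    my ($body) = @_;
    my $data = decode_json($body);
    die "no 'html' field in response\n" unless defined $data->{html};
    return $data->{html};
}

# Request the same URL the page's JavaScript was seen requesting.
sub fetch_content {
    my ($url) = @_;
    my $http = HTTP::Tiny->new(timeout => 10);
    my $res  = $http->get($url, {
        # Some servers check this header on AJAX calls; copy whatever
        # headers you actually observed in the browser.
        headers => { 'X-Requested-With' => 'XMLHttpRequest' },
    });
    die "Request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};
    return extract_content($res->{content});
}

# Only touch the network when a URL is given on the command line.
print fetch_content($ARGV[0]), "\n" if @ARGV;
```

If the endpoint returns an HTML fragment rather than JSON, extract_content would shrink to returning the body as-is.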
Re: How to download content generated by Javascript by pc88mxer (Vicar) on Mar 19, 2008 at 21:49 UTC
Unfortunately, I believe Perl is the wrong tool for web scraping these days. It's fine for Web 1.0 applications, but for 2.0 apps you are much better off using the browser itself, because you really need a complete JavaScript/DOM environment to do it adequately. I'd investigate Firefox plugins like Greasemonkey or Selenium. Selenium is controllable via Perl, so there is still room for Perl, but all the heavy lifting is going to be done by Firefox and the Selenium plugin.
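
For what it's worth, the Perl side of driving a browser can be quite small. The sketch below assumes the Selenium::Remote::Driver module from CPAN (not something named in this thread) and a Selenium server already listening on localhost:4444. The URL is whatever you pass on the command line, and grab_div_text is just an illustrative helper name.

```perl
use strict;
use warnings;

# Illustrative helper: load $url in the browser behind $driver and return
# the rendered text of the element with the given id.
sub grab_div_text {
    my ($driver, $url, $id) = @_;
    $driver->get($url);
    # find_element croaks if the element cannot be located.
    my $element = $driver->find_element($id, 'id');
    return $element->get_text;
}

# Only start a browser session when a URL is given on the command line.
if (@ARGV) {
    # Loaded lazily so the helper above can be exercised without Selenium.
    require Selenium::Remote::Driver;

    # ASSUMPTION: a Selenium server is running locally on port 4444.
    my $driver = Selenium::Remote::Driver->new(
        remote_server_addr => 'localhost',
        port               => 4444,
        browser_name       => 'firefox',
    );

    # Give the page's JavaScript up to 10 seconds (value is in ms)
    # to create the element before find_element gives up.
    $driver->set_implicit_wait_timeout(10_000);

    my $text  = eval { grab_div_text($driver, $ARGV[0], 'content') };
    my $error = $@;
    $driver->quit;            # always close the browser session
    die $error if $error;
    print "$text\n";
}
```

Because the browser runs the page's own scripts, the text returned is what a visitor would see in the div, at the cost of starting a full browser for every run.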

