in reply to Re: Wanted: LWP with javascript
in thread Wanted: LWP with javascript

Thanks for the feedback. I don't think I communicated what I want clearly. I want to avoid using a browser at all and run the script from the command line. The problem is that when I fetch the example URL with LWP, the content that comes back is all JavaScript. Having the browser save the text of the page works because the browser knows how to execute JavaScript. I am looking for an LWP-style package that gets me that content without my having to save the page to a file by hand.

Re^3: Wanted: LWP with javascript
by dorward (Curate) on Jan 20, 2006 at 00:12 UTC

    You don't have to manually save the content of the page to a file. The previous poster just did that for the sake of their investigation. All you have to do is replace the code that slurps a file into a scalar with code that slurps an HTTP resource into a scalar, and LWP can do that easily.
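
    A minimal sketch of that swap, assuming LWP::UserAgent is installed. The name slurp_url is made up here for illustration and is not from the earlier post; the URL is passed on the command line.

    ```perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    # Fetch a URL and return its body as one scalar: the HTTP
    # counterpart of slurping a saved file. (slurp_url is a
    # hypothetical helper name, not part of LWP.)
    sub slurp_url {
        my ($url) = @_;
        my $ua  = LWP::UserAgent->new( timeout => 30 );
        my $res = $ua->get($url);
        die "GET $url failed: " . $res->status_line . "\n"
            unless $res->is_success;
        return $res->decoded_content;
    }

    # Only fetch when a URL is given on the command line.
    if (@ARGV) {
        my $content = slurp_url( $ARGV[0] );
        # $content holds the same kind of text the saved file held;
        # pass it to whatever parsing code read the file before.
        print length($content), " characters fetched\n";
    }
    ```

    Run it as, for example, perl slurp.pl http://www.nyse.com/about/listed/lc_A.html. Note that this returns the page exactly as the server sends it, script and all; it does not execute any JavaScript.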

      I guess what I am saying is that there is a VERY BIG difference between what happens when you run $ua->get("http://www.nyse.com/about/listed/lc_A.html") and what happens when you do File -> Save As from a browser. The first gives you raw JavaScript as the page, since it is the browser's job to run the JavaScript and generate the HTML. If you save from the browser, the JavaScript has already been interpreted and you get HTML. I can parse HTML; in most cases I am not so good with JavaScript. I was hoping there would be some JavaScript-enabled version of LWP that would crunch the JavaScript in the returned content and give me HTML, not JavaScript.

        You clearly did not try the code given by InfiniteSilence against the HTML returned by LWP; if you had, you would have seen that it works perfectly. It runs against the "JavaScript version" of the page.

        By the way, the answer to your original question is no: it is not possible to do that without browser automation. To make sense of that script you need not only a JavaScript interpreter but also a full-fledged DOM engine and HTML processor of exactly the sort contained in web browsers.