ray.rick.mini has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I'm trying to extract text data from a web page that contains a lot of JavaScript. I can navigate to the page, but once there I can't get the information, because it is held in JavaScript runtime variables (should those be called the DOM? I'm pretty confused). I identified the text I want with Firebug, in the DOM panel. The object that holds it looks like an array and is called Diary. I can't access it from Perl using the eval() or eval_in_page() methods. I tried this piece of code:

my ($contest, $type) = $mech2->eval_in_page( 'Diary' ) or warn "$!";
print Dumper \$contest;
print Dumper \$type;

Resulting in:

MozRepl::RemoteObject: ReferenceError: Diary is not defined at ./test.pl line 144.

Of course, the content() and text() methods return only empty textareas. I'm looking for good suggestions. If possible, I would like to inject JS code that dumps every variable readable in the current page context. I'm afraid that Diary is not readable or is out of scope. Is there a way to do this? Thanks for any help or suggestions.
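For reference, the kind of JavaScript one might inject to "dump every variable" can be sketched as below. This is only an illustration under stated assumptions: probeGlobal, listGlobals and fakeWindow are hypothetical names, not part of the page or of any library, and fakeWindow stands in for the browser's real window object so the sketch runs outside a browser. Whether eval_in_page actually exposes the page's own globals to such code is not established here.

```javascript
// Hypothetical helpers; none of these names come from the page or from any library.

// Report whether `name` exists on `globalObject` without throwing a ReferenceError,
// which is what a bare reference to an undeclared identifier does.
function probeGlobal(globalObject, name) {
  if (!(name in globalObject)) {
    return 'missing';
  }
  var value = globalObject[name];
  return Array.isArray(value) ? 'array' : typeof value;
}

// Map every enumerable own property of `globalObject` to a short type label.
function listGlobals(globalObject) {
  var summary = {};
  Object.keys(globalObject).forEach(function (name) {
    try {
      summary[name] = probeGlobal(globalObject, name);
    } catch (e) {
      summary[name] = 'inaccessible'; // some browser properties throw when read
    }
  });
  return summary;
}

// Stand-in for a page's window object, so the sketch runs outside a browser.
var fakeWindow = { Diary: ['first entry', 'second entry'], counter: 3 };

console.log(probeGlobal(fakeWindow, 'Diary'));       // array
console.log(probeGlobal(fakeWindow, 'NoSuchName'));  // missing
console.log(JSON.stringify(listGlobals(fakeWindow)));
```

In a real page one would pass window instead of fakeWindow and expect a long list that includes the browser's built-in properties. Variables declared with let or const at the top level, or hidden inside a function closure, are not properties of window and would not show up. If the page uses frames, each frame has its own window object, so a name visible in Firebug may belong to a different frame than the one being evaluated.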

Re: Javascript variables access help with WWW::Mechanize::Firefox
by Anonymous Monk on Sep 28, 2016 at 23:28 UTC

    The DOM object where they are retained seems like an array, that is called Diary

    In JavaScript, the global object for the page is called "window"; the document is reachable from it as "window.document"

    To retrieve an element object from the document, use XPath

      Hello, could you explain what you mean, with an example XPath expression? I already use the xpath method in several parts of the script, but only to search for HTML elements in the page, not for values stored in JS variables. For example, I use:

      my $xpath = '//td[@class="test"]'; ..

      And thanks for your reply!

        Hello, could you explain what you mean, with an example XPath expression? I already use the xpath method in several parts of the script, but only to search for HTML elements in the page, not for values stored in JS variables. For example, I use:

        The variable "window" is the entry point to the DOM; through it you reach all the HTML elements that exist on the web page you're on.

        The Firebug DOM panel and the HTML panel both represent "window": the current page, the current document, the current DOM.

        So use the HTML panel's "Copy XPath" command and give the result to Mechanize's xpath method, to retrieve the object for the element containing the text you're interested in.