I think it's probably easier to just reverse engineer the request using HTTP::Recorder or (more low-level) log your browser's actual request using a basic HTTP proxy.

update: The SpiderMonkey JavaScript engine only does JavaScript. It has no concept of a browser: that means no document, no DOM, no HTML forms. A simple document.write() will not work because there is no document object. You might be able to extract the script from the HTML page, hand it a fake document object and have the script write to that (provided it doesn't try to handle any events, or read from or write to the DOM or anything like that) and then have that document object return its content to you.
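A minimal sketch of the fake document idea, written in plain JavaScript so it runs under any engine that lacks a browser (Node here, as an assumption; the same object could be handed to SpiderMonkey). The names makeFakeDocument, runWithFakeDocument and content() are made up for illustration, and the sample script is invented, not taken from any real page. It only works for scripts that do nothing but call document.write().

```javascript
// Stand-in for the browser's document object (hypothetical helper).
// Only write() and writeln() exist; there is no DOM behind it.
function makeFakeDocument() {
  const chunks = [];
  return {
    write(...args) { chunks.push(args.join("")); },
    writeln(...args) { chunks.push(args.join("") + "\n"); },
    // Not part of the real document API: returns everything written so far.
    content() { return chunks.join(""); },
  };
}

// Run a script's source with `document` bound to the fake object.
// Note: this executes the script with full access to the host environment,
// so only feed it scripts you trust.
function runWithFakeDocument(scriptSource) {
  const fakeDocument = makeFakeDocument();
  const run = new Function("document", scriptSource);
  run(fakeDocument);
  return fakeDocument.content();
}

// Example script of the kind a page might embed (invented for illustration).
const pageScript = `
  var token = "abc" + (40 + 2);
  document.write('<input type="hidden" name="token" value="' + token + '">');
`;

const written = runWithFakeDocument(pageScript);
console.log(written);
// prints: <input type="hidden" name="token" value="abc42">
```

Anything the script touches besides write() and writeln() (events, document.forms, window, and so on) will throw, which is a useful early signal that this approach won't be enough for that page.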

Then you will have to figure out where the written pieces go in your HTML form, pass it into WWW::Mechanize, convince WWW::Mechanize the page you've just created is actually located on a remote server (not that hard, probably) and submit the form.
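For the "where do the written pieces go" part: in a browser, document.write() called while the page is loading inserts its output at the position of the script element that called it. A rough sketch under that assumption replaces each inline script block with whatever it wrote. The function name expandInlineScripts and the sample form are hypothetical, and the regex only copes with simple inline scripts (no src attributes, no nested "</script>" strings). Handing the resulting HTML to WWW::Mechanize and submitting it is a separate step not shown here.

```javascript
// Replace each inline <script> block with whatever it wrote via document.write().
function expandInlineScripts(html) {
  const scriptBlock = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  return html.replace(scriptBlock, (whole, source) => {
    const chunks = [];
    // Fake document that only records what gets written.
    const fakeDocument = { write: (...args) => chunks.push(args.join("")) };
    // Executes the page's script: only do this with scripts you trust.
    new Function("document", source)(fakeDocument);
    return chunks.join("");
  });
}

// Invented sample page: the hidden field only exists after the script runs.
const page = `<form action="/step2" method="post">
<script>document.write('<input type="hidden" name="step" value="' + (1 + 1) + '">');</script>
<input type="submit">
</form>`;

const expanded = expandInlineScripts(page);
console.log(expanded);
```

The expanded HTML now contains an ordinary hidden input where the script used to be, so a form parser that knows nothing about JavaScript can see and submit it.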

Repeat until you've reached the last page.

Actually, what you want is a completely automated browser. I hear IE can be controlled via OLE or something like that. I don't know how well that works. I'm not familiar with any automation options for Mozilla.

updated: fixed some typos


In reply to Re: more screen scraping with embedded Javascript by Joost
in thread more screen scraping with embedded Javascript by geektron
