in reply to Win32::IE::Mechanize not getting correct content

I've recently been playing around with AJAX (hasn't everyone?), and I've noticed something annoying:

If you load an HTML page into IE that contains a div tag, like this:

<div id="ajaxstuff" >Hello World!</div>

and a javascript function that somehow changes this div's content, like this:

document.getElementById("ajaxstuff").innerHTML = 'Howdy!';

maybe this javascript is inside a function that gets run when the page loads:

<body onLoad="init();">

or maybe it gets run as the result of an event, like clicking a button somewhere on the page - it doesn't seem to make a difference...

If you select 'view->source' from the IE browser menu anytime after this function has been run, you will see the original "Hello World!" message, not the updated "Howdy!" message - even though "Howdy!" is being displayed in the current browser window!

This doesn't really answer your question (sorry!) - but I think it's an important clue to what you're seeing: if a page is somehow modified by javascript after the initial page load, then view->source will not reflect the update, but will instead show the original HTML. So any script attempting to mechanize this will be hitting the proverbial moving target.
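
One way to check this for yourself, as a rough sketch: ask the browser for the markup of the live DOM instead of trusting view->source. The helper name liveMarkup is made up for illustration; the only real API it relies on is the standard documentElement.innerHTML property. It takes the document as an argument so it can also be tried against a stand-in object outside a browser.

```javascript
// Hypothetical helper (not part of any library): returns the markup of the
// document as it stands right now, including changes made by scripts.
function liveMarkup(doc) {
  // documentElement is the <html> element; innerHTML serializes its
  // current children rather than the HTML that was originally downloaded.
  return doc.documentElement.innerHTML;
}

// In a browser, call it with the real document after the page's scripts
// have run. Outside a browser there is no global document, so this is skipped.
if (typeof document !== 'undefined') {
  alert(liveMarkup(document));
}
```

Run against the example page above after init() has fired, the div in the result should contain "Howdy!", while view->source keeps showing "Hello World!".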

You might be able to 'reverse engineer' the real (updated) HTML by looking for div tags, javascript functions, and possibly URLs invoked behind the scenes (if you point your browser to these, you'll get back the raw response data used to update the page).

Of course, it may be encrypted or obfuscated, etc., but it's worth a try.
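
As a starting point for hunting down those behind-the-scenes URLs, here is a minimal sketch that scans a chunk of javascript source for quoted strings that look like URLs or absolute paths. The function name findCandidateUrls and the pattern are my own assumptions, not anything from the original page; it will miss URLs that are built up by string concatenation and may pick up strings that are never actually requested.

```javascript
// Hypothetical helper: collect quoted string literals that look like URLs
// or absolute paths, as candidates for the requests a page's scripts make.
function findCandidateUrls(scriptSource) {
  // A single- or double-quoted literal starting with http://, https:// or "/",
  // closed by the same kind of quote that opened it.
  var pattern = /(["'])((?:https?:\/\/|\/)[^"'\s]+)\1/g;
  var found = [];
  var match;
  while ((match = pattern.exec(scriptSource)) !== null) {
    if (found.indexOf(match[2]) === -1) {
      found.push(match[2]); // keep each candidate once, in order of appearance
    }
  }
  return found;
}
```

You would feed it the text of the page's script blocks or external .js files, then try each candidate in the browser to see what comes back.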

BTW - Firefox seems to do the same thing. :-(

Good luck!

OT: debugging HTML DOM & Javascript (was: Re^2: Win32::IE::Mechanize not getting correct content)
by Joost (Canon) on Mar 16, 2007 at 16:16 UTC
    If you want to see the effect of script changes on the DOM (that is, you want to see the current document structure instead of the original HTML source) in Firefox, take a look at Firebug. It shows live updates to the DOM and CSS (and allows you to change them), has a script-accessible javascript log (with stack traces), can evaluate user-entered javascript in the current page's context, logs XMLHttpRequest calls, and more. It really is the most useful tool I know for javascript (and HTML) debugging.