in reply to www::mechanize check if webpage has updated

Store the time of your latest retrieval. Issue a HEAD request with the 'If-Modified-Since' header set to that time. Don't do a retrieval if the response is '304 Not Modified'.

Of course, a webserver may not honor the 'If-Modified-Since' header, or send a 200 anyway. There's no guaranteed way of finding out whether a page has changed without retrieving it.
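
A minimal sketch of the conditional check described above. It assumes plain LWP::UserAgent (WWW::Mechanize is a subclass, but with autocheck enabled it dies on any non-2xx response, including a 304). The sub names conditional_head and needs_refetch are made up for illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Date qw(time2str);

# Build a HEAD request asking whether $url changed after $last_fetch
# (epoch seconds). time2str() produces the HTTP date format.
sub conditional_head {
    my ($url, $last_fetch) = @_;
    my $req = HTTP::Request->new(HEAD => $url);
    $req->header('If-Modified-Since' => time2str($last_fetch));
    return $req;
}

# True when a full retrieval is needed. Anything other than a 304 is
# treated as "possibly changed", since servers may ignore the header.
sub needs_refetch {
    my ($ua, $url, $last_fetch) = @_;
    my $res = $ua->request(conditional_head($url, $last_fetch));
    return $res->code != 304;
}

# Only touches the network when a URL is given on the command line.
if (@ARGV) {
    my $ua = LWP::UserAgent->new;
    my $last_fetch = time - 3600;    # assumption: last retrieval was an hour ago
    print needs_refetch($ua, $ARGV[0], $last_fetch)
        ? "fetch again\n"
        : "not modified\n";
}
```

Treating every non-304 response as "changed" is the cautious choice here: a false positive only costs one extra download.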

Re^2: www::mechanize check if webpage has updated
by Anonymous Monk on Sep 16, 2010 at 12:17 UTC
    The problem is that the page I want to check is obtained through submit_form. So it's fine even if I can only check after the HTTP::Response object is obtained, for example by comparing it with a saved version.

      Instead of a saved version, just store the hash of the saved version and do nothing if the hash is the same.

      Instead of hashing the whole page, just hash the portion(s) of the content that you care about. That way different advertisements or updates to the nav links in the header/footer won't throw you off (unless you care about those sorts of changes too).
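
A rough sketch of the hashing idea from the two replies above, using Digest::SHA from core. The `<div id="content">` marker and the sub names are hypothetical; substitute whatever delimits the part of the page you care about. The regex is deliberately crude (it stops at the first closing div), so a real parser such as HTML::TreeBuilder is likely the safer choice for nested markup.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);
use Encode qw(encode_utf8);

# Hash only the interesting portion of the page.
# Falls back to the whole page if the (hypothetical) marker is absent.
sub content_fingerprint {
    my ($html) = @_;
    my ($part) = $html =~ m{<div id="content">(.*?)</div>}s;
    $part = $html unless defined $part;
    # sha256_hex wants bytes, so encode decoded text first.
    return sha256_hex(encode_utf8($part));
}

# Returns (changed flag, new digest). With no stored digest yet,
# the page counts as changed.
sub has_changed {
    my ($html, $old_digest) = @_;
    my $new = content_fingerprint($html);
    my $changed = (!defined $old_digest || $new ne $old_digest) ? 1 : 0;
    return ($changed, $new);
}
```

With WWW::Mechanize, the HTML passed in could be `$mech->content` right after the `submit_form` call; store the returned digest (in a file, say) and pass it back in as `$old_digest` on the next run.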