in reply to Yet Another Scraping Question

It sounds to me like you've already answered your own question:

"the last page just reloads and the URL doesn't change"

If the URL doesn't change, then it will be the same as the current page, right?

If there are other subtleties to this that weren't mentioned in the OP, please elaborate. Comparing URLs should be possible, and it will probably be more reliable than trying to match page content (especially if there are floating ads or other dynamic content in the page).
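
As a rough illustration of that URL check, here is a minimal Perl sketch. It assumes a hypothetical $fetch callback that takes a URL and returns the URL of that page's "next" link; the example.com pages are made up, and a real scraper would do the fetching with something like LWP or WWW::Mechanize.

```perl
use strict;
use warnings;

# Follow "next" links until the link points back at the page we are on.
# $fetch is a hypothetical callback: given a URL, it returns the URL of
# that page's "next" link (or undef if the page has no such link).
sub collect_pages {
    my ($fetch, $url, $max_pages) = @_;
    $max_pages //= 1000;    # safety limit so a bad site cannot loop forever
    my @visited;
    while (@visited < $max_pages) {
        push @visited, $url;
        my $next = $fetch->($url);
        last if !defined $next;    # no "next" link at all
        last if $next eq $url;     # last page: "next" just reloads this URL
        $url = $next;
    }
    return @visited;
}

# Stand-in for a real site (made-up URLs): page3 links to itself.
my %next_link = (
    'http://example.com/page1' => 'http://example.com/page2',
    'http://example.com/page2' => 'http://example.com/page3',
    'http://example.com/page3' => 'http://example.com/page3',
);
my @pages = collect_pages(sub { $next_link{ $_[0] } }, 'http://example.com/page1');
print scalar(@pages), " pages visited\n";
```

The $max_pages limit is only a guard against a site whose "next" links never settle; it is not part of the check itself.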

Update: If you want to track what is being sent when you click on a link, you can use the LiveHTTPHeaders extension for Mozilla, or a packet sniffer such as Ethereal.
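
If you would rather capture the traffic from inside a script, LWP::UserAgent can log each request and response through its handler hooks. This is a sketch of that alternative, not of the browser or sniffer tools themselves; the logging_ua name is made up here, and it assumes LWP is installed.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that prints every request and response it handles.
# The in-memory cookie jar keeps any session cookie between requests.
sub logging_ua {
    my $ua = LWP::UserAgent->new(cookie_jar => {});
    # Both handlers return nothing, so LWP carries on with the request as usual.
    $ua->add_handler(request_send => sub {
        print STDERR $_[0]->as_string, "\n";             # full outgoing request
        return;
    });
    $ua->add_handler(response_done => sub {
        print STDERR $_[0]->status_line, "\n",
                     $_[0]->headers->as_string, "\n";    # response status and headers
        return;
    });
    return $ua;
}

# Usage: perl script.pl URL  (fetches the URL and logs what was exchanged)
if (@ARGV) {
    my $ua = logging_ua();
    $ua->get($ARGV[0]);
}
```

If the server tracks the current page in a session, a Cookie header in the logged requests would be the most likely place to see it.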

HTH

Re^2: Yet Another Scraping Question
by Cody Pendant (Prior) on Apr 18, 2006 at 01:45 UTC
    If there are other subtleties to this that weren't mentioned in the OP, please elaborate.

    Sorry, I wasn't clear. The URL never changes. It's always "company.com/showdata.asp", and the "next" link always points to "company.com/showdata.asp?move=nextpage". Presumably the server keeps track of the fact that I was on page 1, in a session or something similar, and works out that the next page is 2 in some way that isn't visible in either URL.



    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
Re^2: Yet Another Scraping Question
by Cody Pendant (Prior) on Apr 18, 2006 at 02:09 UTC
    Good point about the HTTP headers, but honestly, what's sent is what I showed you, and there's nothing on this side of the server that shows me what the current or next page is.


    ($_='kkvvttuu bbooppuuiiffss qqffssmm iibbddllffss')
    =~y~b-v~a-z~s; print
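
Since the URL is the same for every page in this case, a URL check cannot tell the pages apart. One fallback, with the caveat about dynamic content raised earlier in the thread, is to keep requesting the "next" URL and stop when the page body stops changing. A minimal sketch, assuming a hypothetical $get_next callback that requests company.com/showdata.asp?move=nextpage within the same session and returns the body; the page bodies below are made up:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Keep requesting the "next" page until the same body comes back twice.
# $get_next is a hypothetical callback that fetches the next page within
# the same session and returns its body as a byte string.
sub fetch_until_repeat {
    my ($first_body, $get_next, $max_pages) = @_;
    $max_pages //= 1000;    # safety limit in case the body never repeats
    my @bodies      = ($first_body);
    my $last_digest = md5_hex($first_body);
    while (@bodies < $max_pages) {
        my $body   = $get_next->();
        my $digest = md5_hex($body);
        last if $digest eq $last_digest;    # same page again: we were on the last one
        push @bodies, $body;
        $last_digest = $digest;
    }
    return @bodies;
}

# Stand-in for the server's session: it advances one page per request and
# serves the last page again once it runs out.
my @site     = ('page 1 data', 'page 2 data', 'page 3 data');
my $position = 0;
my $get_next = sub {
    $position++ if $position < $#site;
    return $site[$position];
};
my @bodies = fetch_until_repeat($site[0], $get_next);
print scalar(@bodies), " distinct pages\n";
```

In practice the digest would probably need to cover only the data portion of the page, since ads or timestamps would make every body look different.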