But what visual clue do you look for when reading the page that tells you it's the same as the previous page? How can you tell the difference between "new data" that happens to be "the same" as the data on the previous page, and "old data" that really is just a repeat of what you've already seen?

If you, as a human, can't do that, then there's no way your code will be able to. ... But it sounds like you've already found your answer: you can tell whether the page you are looking at is the same by looking at the PRODUCTID of each row. If some (or all) of them are duplicates from the last page, then it's the same page.
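
For what it's worth, here is a rough sketch of that check. It's in Python only to keep it short and self-contained; the same few lines translate directly to a Perl hash. The function names and the sample IDs are made up for illustration, and it assumes you have already pulled the PRODUCTID values out of each page into a list. It also makes one choice you may want to change: a page counts as a repeat only when every ID on it has been seen before.

```python
def is_repeat_page(seen_ids, current_ids):
    """Return True if the current page holds nothing we haven't seen.

    Assumption: a page is a repeat only when ALL of its product IDs
    were already seen. An empty page is treated as a repeat too,
    since it carries no new rows.
    """
    seen = set(seen_ids)
    if not current_ids:
        return True
    return all(pid in seen for pid in current_ids)


def collect_new_ids(pages):
    """Walk pages in order and stop at the first repeated page.

    `pages` is any iterable of lists of product IDs, one list per page
    (hypothetical input; how you extract the IDs is up to your scraper).
    """
    seen = set()
    collected = []
    for page_ids in pages:
        if is_repeat_page(seen, page_ids):
            break  # the site has started handing back old data
        for pid in page_ids:
            if pid not in seen:
                seen.add(pid)
                collected.append(pid)
    return collected
```

If the site can legitimately list the same product on two different pages, the "all duplicates" rule is the safer one, since a single overlapping row won't stop the crawl early. If a single duplicate should be enough to stop, swap `all` for `any`.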


In reply to Re^3: Yet Another Scraping Question by hossman
in thread Yet Another Scraping Question by Cody Pendant
