Re^3: How to extract links from a webpage and store them in a mysql database
by g0n (Priest) on Dec 06, 2006 at 12:36 UTC
    Step one is probably to write an algorithm to do what you want. Something like this perhaps:

    • Create your database table with columns for 'link', 'depth' and 'read'.
    • Read the first page and store its base URL.
    • For each link in the page, compare its base to the original base URL.
    • If they match, add the link to the DB with depth 2 and read 'no'.
    • For each entry in the table where read eq 'no', read the page, set read to 'yes', and compare the base of each link to the original base URL.
    • If they match, add the link to the DB with depth 3 and read 'no'.
    • Repeat the last two steps, setting depth to 4 (i.e. for a link found on a depth 3 page).
    • End.
    You could end when you no longer find any entries in the DB with depth <= 3 and read eq 'no'. That way the limit is a single number, so it's easy to modify if you decide to read deeper.
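
    The steps above could be sketched roughly like this. It is only an illustration under several assumptions of mine: it is written in Python rather than Perl, it uses an in-process SQLite table as a stand-in for the MySQL table, "base" is taken to mean the host part of the URL, and the start page itself is stored as depth 1. The names crawl, extract_links, fetch and max_depth are made up for the example.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html, page_url):
    """Return absolute URLs (fragments dropped) for the links in html."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(page_url, href))[0] for href in parser.links]


def fetch(url):
    """Download a page over the network and return its text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, conn, max_depth=3, fetch_page=fetch):
    """Fill the links table, reading pages down to max_depth."""
    base = urlsplit(start_url).netloc  # assumption: "base" means the host
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links "
        "(link TEXT PRIMARY KEY, depth INTEGER, read TEXT)"
    )
    # Assumption: the start page is stored too, as depth 1.
    conn.execute("INSERT OR IGNORE INTO links VALUES (?, 1, 'no')", (start_url,))
    while True:
        # End condition: no unread entries left with depth <= max_depth.
        todo = conn.execute(
            "SELECT link, depth FROM links "
            "WHERE read = 'no' AND depth <= ? ORDER BY depth",
            (max_depth,),
        ).fetchall()
        if not todo:
            break
        for link, depth in todo:
            conn.execute("UPDATE links SET read = 'yes' WHERE link = ?", (link,))
            try:
                html = fetch_page(link)
            except (OSError, ValueError):
                continue  # unreachable or unsupported URL: skip it
            for found in extract_links(html, link):
                if urlsplit(found).netloc == base:
                    # PRIMARY KEY + OR IGNORE keeps the first (shallowest) depth.
                    conn.execute(
                        "INSERT OR IGNORE INTO links VALUES (?, ?, 'no')",
                        (found, depth + 1),
                    )
    conn.commit()
```

    Something like crawl("http://example.com/", sqlite3.connect("links.db")) would run it (the URL and file name are placeholders). Making the link column a primary key is my addition so that a page linked from several places is only stored and read once. If you port the table to MySQL, note that I believe READ is a reserved word there, so the 'read' column would need quoting or a different name.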

    --------------------------------------------------------------

    "If there is such a phenomenon as absolute evil, it consists in treating another human being as a thing."
    John Brunner, "The Shockwave Rider".
