in reply to Checking external links for inappropriate content

Note: I'm not saying that you're wrong. At all.

I'd be inclined to take a different tack on this.

I'd have a script check the last-modified date/time on each external link and, if it has changed since the last check, add that link to a list for a human to review.

Of course, there are issues with getting the last-modified time accurately, but I imagine that they're more solvable than parsing for content.
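Something along these lines might do as a starting point. It is only a rough sketch in Python, not a finished tool: the function names, the `last_seen` dictionary, and the choice to send any link without a usable Last-Modified header straight to the human list are all my own assumptions.

```python
from email.utils import parsedate_to_datetime
import urllib.request


def fetch_last_modified_http(url):
    """Return the raw Last-Modified header for url, or None if absent.

    Hypothetical helper: sends a HEAD request with the standard library.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.headers.get("Last-Modified")


def links_needing_review(links, last_seen, fetch=fetch_last_modified_http):
    """Return the URLs a human should look at again.

    links:     iterable of URLs to check
    last_seen: dict mapping URL -> datetime stored by the previous run
               (updated in place when a newer value is found)
    fetch:     callable returning the Last-Modified header string or None
    """
    to_review = []
    for url in links:
        header = fetch(url)
        try:
            modified = parsedate_to_datetime(header)
        except (TypeError, ValueError):
            # Missing or unparseable header: we cannot tell whether the
            # page changed, so (by assumption) a human gets to decide.
            to_review.append(url)
            continue
        previous = last_seen.get(url)
        if previous is None or modified > previous:
            to_review.append(url)
            last_seen[url] = modified
    return to_review
```

Passing the fetcher in as an argument means you can try the logic against canned header values before pointing it at real servers.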

Perhaps each page has a certain string you can check for to make sure it's unchanged?
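As a rough illustration of the string check, here is a small Python sketch. The marker text is something you would pick by hand for each page. The hash comparison is my own addition rather than part of the suggestion above, and it will flag any change at all, including harmless ones such as a rotating advert or a timestamp.

```python
import hashlib


def marker_present(page_text, marker):
    """True if the hand-picked marker string still appears in the page."""
    return marker in page_text


def fingerprint(page_text):
    """SHA-256 digest of the page text, for exact-match comparison."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()
```

You would store the marker (or the digest) when a human first approves the link, then flag the link for review when `marker_present` returns False or the digest no longer matches.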

Hope the different viewpoint helps.

Malach
So, this baby seal walks into a club.....

Re: Re: Checking external links for inappropriate content
by joealba (Hermit) on Feb 14, 2002 at 23:13 UTC
    That's another good idea, but most (if not all) of our external links will be updated quite often. We just don't have the manpower to check every link every time it is updated.

    Besides... why have people do the work that a few well-planned regexps can do? :) Thanks, though!
      Well, you could have your script check only those pages that changed and, as a safety net, check pages that haven't reported changes on a less frequent basis, so that your script doesn't run forever.
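      A minimal sketch of that scheduling rule in Python, assuming you keep a last-checked timestamp per link. The function name and the seven-day safety interval are made up for illustration; pick whatever interval suits your link list.

```python
from datetime import datetime, timedelta


def is_due(last_checked, reported_change, now,
           safety_interval=timedelta(days=7)):
    """Decide whether a link should be scanned on this run.

    last_checked:    datetime of the previous content scan for this link
    reported_change: True if the page reports a change since that scan
    now:             datetime of the current run
    safety_interval: how long an apparently unchanged page may go
                     without a scan (assumed value, tune to taste)
    """
    if reported_change:
        return True
    # Safety net: rescan pages that claim to be unchanged once the
    # interval has elapsed.
    return now - last_checked >= safety_interval
```

      Pages that report a change get scanned right away, and everything else is only revisited once per interval, which keeps each run short.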

                      - Ant
                      - Some of my best work - (1 2 3)