in reply to Web Robot

For me, there's just one "don't":

From that rule, many others can be deduced:

Remember that your robot will be a guest in other people's territories. Act accordingly.

Abigail

Re: Re: Web Robot
by Anonymous Monk on Jul 17, 2003 at 01:58 UTC

    You'll note that most major search engines violate the majority, if not all, of these rules. So don't take them too seriously.

    Don't let that stop you from playing nice though :)

      Most major search engines don't stick to those rules, true. So if you rely on a robots.txt on your site, be aware that it may be ignored.

      But I think that just because some big companies/search engines don't stick to the rules doesn't mean that you should do the same. I always go by the maxim: don't do unto someone else what you wouldn't want done to you or your site.

      Just my 2 Rappen (Swiss equivalent to cents).

      --cs

      There are nights when the wolves are silent and only the moon howls. - George Carlin
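
One common reading of "playing nice" is to space out requests so that no single host gets hammered. The posters above do not spell this out, so treat the following as a minimal sketch under that assumption. The class name `PoliteThrottle`, the default delay, and the injectable `clock`/`sleep` parameters are all hypothetical choices made for illustration.

```python
import time
from urllib.parse import urlparse


class PoliteThrottle:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; the 1-second default is an
    assumption, not a figure taken from the thread.
    """

    def __init__(self, min_delay=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_delay = min_delay
        self._clock = clock      # injectable so the behaviour can be tested
        self._sleep = sleep
        self._last_hit = {}      # host -> time of the most recent request

    def wait(self, url):
        """Sleep if needed before hitting url's host; return seconds slept."""
        host = urlparse(url).netloc
        last = self._last_hit.get(host)
        waited = 0.0
        if last is not None:
            remaining = self.min_delay - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        # Record the moment this request is allowed to go out.
        self._last_hit[host] = self._clock()
        return waited
```

A robot would call `throttle.wait(url)` just before each fetch. Requests to different hosts do not delay each other, since the bookkeeping is per host.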

        On the topic of robots.txt, why would someone even use this? If you don't want a page accessed, limit access to it. Depending on all computers to play nice isn't a very smart move; they have many hidden motives :)

      The question is ... does this really matter? I mean, if your pages are indexed, they are more likely to be found, so you get more hits, more ad views and, in the end, more money. So I would not care that much if a search engine flooded my server with requests once a month.

      So IMHO the only violation that might matter is not obeying robots.txt. Actually, could someone give me an example of reasonable robots.txt usage?

      Jenda
      Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
         -- Rick Osborne
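
As one possible answer to the question about reasonable robots.txt usage: a site might keep robots out of dynamically generated pages (CGI scripts, search results) that are expensive to serve and pointless to index. The robots.txt content below is a made-up example, and `ExampleBot` and the `example.com` URLs are hypothetical. The sketch uses Python's standard `urllib.robotparser` to show how a well-behaved robot would interpret such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; the paths are assumptions.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /search
Crawl-delay: 10
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser, url, agent="ExampleBot"):
    """Return True if the rules allow this (hypothetical) agent to fetch url."""
    return parser.can_fetch(agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)
static_ok = may_fetch(parser, "http://example.com/index.html")
cgi_ok = may_fetch(parser, "http://example.com/cgi-bin/vote.pl")
search_ok = may_fetch(parser, "http://example.com/search?q=robots")
delay = parser.crawl_delay("ExampleBot")  # seconds requested by the site
```

With these rules, the static page is allowed while the CGI and search URLs are not. Note that `Crawl-delay` is a widely used extension rather than part of the original robots.txt convention, and, as pointed out earlier in the thread, nothing forces a robot to honour any of it.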
