in reply to Re: Web Robot
in thread Web Robot

You'll note that most major search engines violate the majority, if not all, of these rules. So don't take them too seriously.

Don't let that stop you from playing nice though :)

Re: Re: Re: Web Robot
by schumi (Hermit) on Jul 17, 2003 at 08:53 UTC
    Most major search engines don't stick to those rules, true. So if you use a robots.txt on your site, be aware that it may be ignored.

    But I think that just because some big companies/search engines don't stick to the rules doesn't mean that you should do the same. I always go by the maxim: don't do unto someone else what you wouldn't want done to you or your site.

    Just my 2 Rappen (Swiss equivalent to cents).

    --cs

    There are nights when the wolves are silent and only the moon howls. - George Carlin

      On the topic of robots.txt, why would someone even use this? If you don't want a page accessed, limit access to it. Depending on all computers to play nice isn't a very smart move; they have many hidden motives :)

        Quite true, although most major search engines do actually heed the robots file, if it is set up properly.

        I think the easiest way to restrict access to a directory is setting up a proper .htaccess-file. You could even restrict access by IP-addresses...

        On the other hand, using a robots file (in addition to the above, note!) decreases the number of 404s in your error logs... ;-)

        --cs

        There are nights when the wolves are silent and only the moon howls. - George Carlin

Re: Re: Re: Web Robot
by Jenda (Abbot) on Jul 17, 2003 at 11:23 UTC

    The question is ... does this really matter? I mean, if your pages are indexed, they are more likely to be found, so you get more hits, more ad views and in the end more money. So I would not care that much if a search engine flooded my server with requests once a month.

    So IMHO the only violation that might matter is not obeying robots.txt. Actually, could someone give me an example of reasonable robots.txt usage?

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature
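
For what it's worth, here is a rough sketch of what "heeding the robots file" could look like from the robot's side. It is not taken from any post above: the robots.txt contents, the agent names (MyRobot, BadBot) and the example.com URLs are all made up for illustration, and the `allowed` helper is hypothetical. It uses Python's standard `urllib.robotparser` and parses the rules from a string, so nothing is fetched over the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not from the thread): keep all
# robots out of script and scratch directories, and ban one robot entirely.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/

User-agent: BadBot
Disallow: /
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so the rules can come from a string
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("MyRobot", "http://example.com/index.html"),
        ("MyRobot", "http://example.com/cgi-bin/search.pl"),
        ("BadBot", "http://example.com/index.html"),
    ]:
        verdict = "fetch" if allowed(agent, url) else "skip"
        print(agent, url, verdict)
```

A polite robot would run a check like this before every request. As noted above, it only helps against robots that choose to ask; anything that really must stay private still needs proper access control.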