in reply to (OT) Robots disallow

The robots or crawlers are free to fully disregard the robots.txt directives. Certainly that is not nice, but the world is full of less than nice people (and robots and crawlers and ...)
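
To make that concrete: the robots.txt check runs entirely inside the crawler, so it only happens if the crawler's author bothers to write it. Below is a minimal sketch of what a well-behaved crawler might do, using Python's standard urllib.robotparser. The rules, the "PoliteBot" user agent, and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "PoliteBot" and the URL are illustrative assumptions.
    url = "http://example.com/private/report.html"
    if may_fetch("PoliteBot", url):
        print("allowed:", url)
    else:
        # Nothing on the server enforces this branch; a rude crawler
        # simply never performs the check and fetches the page anyway.
        print("disallowed, skipping:", url)
```

The server plays no part in this decision: it will serve /private/report.html to anyone who asks for it, whether or not they consulted robots.txt first.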

I would not worry much about this. If you do not want the world to know about the info on your web site, then don't publish it where everyone can see it, or put it behind password protection.

Some more info can be found at The Web Robots pages.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re^2: (OT) Robots disallow
by ikegami (Patriarch) on Apr 01, 2009 at 16:24 UTC

    The robots or crawlers are free to fully disregard the robots.txt directives. Certainly that is not nice, but the world is full of less than nice people (and robots and crawlers and ...)

    And conversely, there are less than nice web sites blocking robots for no reason.