I found this site, The Web Robots Pages, useful for sysadmins and would-be Web robot programmers who don't happen to know what robots.txt is.
Does anyone know how common or uncommon robots.txt is?
Or would anyone like to share any do's and don'ts about writing a Web robot (or anything that programmatically fetches something for you via the Web)? It seems rather common for people not to specify the "agent" (whose default value is "libwww-perl/#.##") when using LWP::UserAgent. That may or may not matter, depending on the sites your script or robot is visiting.
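To make the "agent" point concrete, here is a minimal sketch of how one might identify a script and check robots.txt rules before fetching. The robot name `ExampleBot/0.1`, the contact address, and the `example.org` URLs are all made up for illustration, and the robots.txt body is inlined so the snippet runs without touching the network.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use WWW::RobotRules;

# Hypothetical robot name and contact address; replace with your own.
my $agent_name = 'ExampleBot/0.1';
my $contact    = 'webmaster@example.org';

# Identify the script instead of sending the default "libwww-perl/#.##".
my $ua = LWP::UserAgent->new;
$ua->agent("$agent_name (+mailto:$contact)");
$ua->from($contact);    # sets the From: request header

# A made-up robots.txt body, inlined so the sketch works offline.
# A real robot would fetch it from the site with $ua->get(...).
my $robots_txt = <<'END';
User-agent: *
Disallow: /private/
END

# Parse the rules and ask whether each URL may be fetched.
my $rules = WWW::RobotRules->new($agent_name);
$rules->parse('http://example.org/robots.txt', $robots_txt);

for my $url ('http://example.org/index.html',
             'http://example.org/private/data.html') {
    my $verdict = $rules->allowed($url) ? 'allowed' : 'disallowed';
    print "$url: $verdict\n";
}
```

If you would rather not wire this up by hand, LWP::RobotUA (part of libwww-perl) is a LWP::UserAgent subclass that consults robots.txt and spaces out requests for you; its constructor requires both an agent name and a contact address.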