in reply to Re^3: screen scraping google
in thread screen scraping google
Those are spiders which obey robots.txt, though. All they're asking is that others do the same.
Belden

# http://www.google.com/robots.txt
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalog_list
Disallow: /news
Disallow: /pagead/
Disallow: /relpage/
Disallow: /imgres
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /advanced_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /wml
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local
Disallow: /froogle?
Disallow: /froogle_
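For anyone wanting to check a URL against rules like these before fetching it, here is a rough sketch using Python's standard `urllib.robotparser`. It works offline on a handful of rules copied from the listing above rather than downloading the live file. The `is_allowed` helper and the `ExampleBot` user-agent name are made up for illustration, and the `/about` URL is only a stand-in for a path that the excerpt does not disallow.

```python
from urllib.robotparser import RobotFileParser

# A few rules copied from the robots.txt listing above (not the full file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /news
"""


def is_allowed(url, user_agent="ExampleBot"):
    """Return True if the excerpted rules permit user_agent to fetch url.

    'ExampleBot' is a hypothetical user-agent name used only for this sketch.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.google.com/search?q=perl",
                "http://www.google.com/about"):
        print(url, "->", "allowed" if is_allowed(url) else "disallowed")
```

A real spider would more likely point the parser at the site's own robots.txt with `set_url()` and `read()`, so that it honours whatever the site currently publishes instead of a pasted copy.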