PerlMonks  

Re: Check if your site's been banned with Google

by merlyn (Sage)
on Jan 08, 2007 at 14:45 UTC ( [id://593544]=note )


in reply to Check if your site's been banned with Google

Google does not permit you to screen-scrape Google in this manner.

Please use the Google API, conveniently wrapped in Net::Google.

Replies are listed 'Best First'.
Re^2: Check if your site's been banned with Google
by Joost (Canon) on Jan 08, 2007 at 14:55 UTC

      Oddly enough, the Google AJAX API FAQ lists 15 questions, but only contains 7 answers.

      From a previous reading of the rules of use, you were specifically NOT to use it on anything other than a website, and you were not allowed to do anything other than present the information directly as returned by Google. Unfortunately, answers #9 and #11 aren't listed right now.

      (of course ... would it then be ethical to scrape that website that you created?)

      Update: 9 and 11, not 8 and 11.

Re^2: Check if your site's been banned with Google
by eric256 (Parson) on Jan 10, 2007 at 01:48 UTC

    Is it truly legal for a site to tell you how to use content they provide on the internet? If he was planning to redistribute the info on his own website I could understand, but for a personal command-line tool? Isn't that kind of like saying "you have to READ the whole page of HTML we send you; you can't just skim it to see if your site worked"?

    I'm not saying it is ethical, I'm just curious as to how far Google's reach extends over the content it provides. If it were a site I had to register with and agree to its terms of use, I could understand that, but this is a case of limiting the use of information that is made public by Google on purpose. What if I made a GreaseMonkey script that does the same thing and displays it in my browser? Where does the line get drawn? Am I required to view their entire page of HTML based on terms of use that I might not know exist, let alone agree to? Could their terms of use then ban me from using information off a search in any other context? Could they state that I must fully read at least one ad before looking to see if my site was among the other sites listed?

    Like I said, I can understand limits on uses of information that you have to register to see or that you plan on reusing for your own profit, but this doesn't seem to fit either of those cases, so I'm curious. Just some food for thought, and maybe there is an obvious answer out there that I'm not aware of.


    ___________
    Eric Hodges
