PerlMonks  

Re: LWP post

by smitz (Chaplain)
on Jun 18, 2002 at 15:46 UTC ( [id://175398]=note )


in reply to LWP post

A bit OT, but you should probably be aware that Google doesn't like this sort of thing. Many authors of /WWW::Google/-type modules have had requests for their code to be removed.
If you want to do Google searches from within your code, consider taking a look at the Google API.

SMiTZ

Replies are listed 'Best First'.
Re: Re: LWP post
by moxliukas (Curate) on Jun 18, 2002 at 16:03 UTC
    Yes, this is true. Google does not like these things. However, I do have a small script that uses the GET method and could be of some use to the person who asked the question. Use it only for learning and, as smitz said, have a look at the Google API.

    use strict;
    use warnings;
    use URI;
    use LWP::UserAgent;

    my $what_i_want_to_find = 'perl monks';

    my $ua = LWP::UserAgent->new;
    # google doesn't like it when user agent is libwww perl, so
    # I change it to Mozilla/5.0
    $ua->agent('Mozilla/5.0');

    # build the search URL; query_form escapes the search terms
    my $uri = URI->new('http://www.google.com/search');
    $uri->query_form('q' => $what_i_want_to_find);

    my $req = HTTP::Request->new(GET => $uri);
    my $res = $ua->request($req)->as_string;

    # print the first link that directly follows a <p> tag, if any
    if ($res =~ /<p><a href=([^>]+)>/) {
        print $1;
    }

    I hope this was useful.
      Thanks moxliukas, I now understand the GET method. Do you know if the POST method works in a similar fashion when trying to automate the submit button on a form?
        I am not sure how POST works; maybe some other PerlMonks could provide this information. I have used HTML::Form for form submission previously, so have a look at that; it is pretty easy to understand.
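        For reference, here is a minimal sketch of what form submission with HTML::Form might look like. The HTML snippet, the field names and the example.com base URL are all made up for illustration; a real script would first fetch the page containing the form with LWP::UserAgent. Nothing is sent over the network here; the script only builds the request and prints it.

```perl
use strict;
use warnings;
use HTML::Form;

# A made-up page standing in for whatever HTML the real site returns.
my $html = <<'HTML';
<form action="/cgi-bin/search" method="post">
  <input type="text" name="q">
  <input type="submit" name="go" value="Search">
</form>
HTML

# The second argument is the base URL used to resolve the form's action.
# In scalar context parse() returns the first form on the page.
my $form = HTML::Form->parse($html, 'http://www.example.com/');

# Fill in the text field, then "press" the submit button.
# click() returns an HTTP::Request that is ready to be sent.
$form->value('q' => 'perl monks');
my $req = $form->click;

print $req->method, "\n";
print $req->uri, "\n";
print $req->content, "\n";
```

        The resulting request could then be handed to $ua->request($req), the same way as the GET request in the script above.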
        Check out the second bit of sample code in this node to see how to set up a POST request.
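        To answer the POST question in general terms: a POST request is built much like the GET one, except that the form fields travel in the request body instead of the URL. Below is a minimal sketch using HTTP::Request::Common, which ships with libwww-perl. The URL and the field names are hypothetical and would have to be taken from the real form's action and input tags. The actual send is left commented out, so nothing goes over the network.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);
use LWP::UserAgent;

# Hypothetical form handler, used only for illustration.
my $form_url = 'http://www.example.com/cgi-bin/search';

# POST() encodes the field list as application/x-www-form-urlencoded
# and puts it in the request body rather than in the URL.
my $req = POST $form_url, [ 'q' => 'perl monks', 'submit' => 'Search' ];

print $req->method, "\n";     # POST
print $req->content, "\n";    # q=perl+monks&submit=Search

# Sending it works the same way as for a GET request:
# my $ua  = LWP::UserAgent->new;
# my $res = $ua->request($req);
# print $res->is_success ? $res->content : $res->status_line;
```

        Passing the fields as an array reference keeps them in the order given, which some form handlers care about.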
Re: Re: LWP post
by kappa (Chaplain) on Jun 18, 2002 at 16:33 UTC
    I'd like to add that Google is likely to use some defense mechanisms against automated queries via HTTP. WWW::Automate tests itself against the http://google.com/ page, and I get lots of failures reported as "Connection reset by peer" when running several test sessions in sequence.
