in reply to LWP and Google

I searched for "langley public library", and the resulting URL was http://www.google.ca/search?hl=en&q=langley+public+library&meta=, so that is the pattern you need.
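To make the pattern concrete, here is a minimal sketch that builds the same URL from an arbitrary search phrase. The helper name google_search_url is hypothetical (it is not part of LWP or any module), and it assumes the hl, q and meta parameters stay exactly as in the URL above. It only builds the string; it does not fetch anything.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): turn a search phrase
# into a query URL following the pattern observed above.
sub google_search_url {
    my ($phrase) = @_;
    my $q = $phrase;
    utf8::encode($q);    # work on UTF-8 bytes so non-ASCII input escapes cleanly
    # Percent-encode everything except unreserved characters and spaces
    $q =~ s/([^A-Za-z0-9\-._~ ])/sprintf("%%%02X", ord($1))/ge;
    # Spaces become "+" in the query string
    $q =~ tr/ /+/;
    return "http://www.google.ca/search?hl=en&q=$q&meta=";
}

print google_search_url("langley public library"), "\n";
```

For the phrase used here, this prints the same URL as quoted above. If the URI module is installed, its query_form method should give an equivalent result with less hand-rolled escaping.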

Update:

sulfericacid is absolutely right. Thanks for pointing out my mistake! Google does not like LWP::UserAgent either (see Update 2 for more); evidently it checks for bots. This works:

use IO::Socket::INET;
use strict;
use warnings;

my $s = IO::Socket::INET->new(
    Proto    => "tcp",
    PeerAddr => "www.google.ca",
    PeerPort => 80,
);
my $url = "GET /search?hl=en&q=langley+public+library&meta= HTTP/1.1\r\nHost: www.google.ca\r\n\r\n";
print $s $url;
while (my $l = <$s>) {
    print $l;
    last if ($l =~ /<\/html>/);
}

Update 2 ;-) Actually LWP::UserAgent also works with a little trick:

use LWP::UserAgent;
use strict;
use warnings;

my $ua = LWP::UserAgent->new();
$ua->agent("");
my $url = "http://www.google.ca/search?hl=en&q=langley+public+library&meta=";
print $ua->get($url)->content();

Re^2: LWP and Google
by sulfericacid (Deacon) on Nov 08, 2004 at 07:23 UTC
    I've tried to do this a few times before as well (never actually found a solution). Even if you have the search pattern, you can't extract the contents of that page.

    I think what you need to do is set up your own bot/client in order for Google to allow you access.

    If you try the following code, you'll see that you can't get the contents back:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use LWP::Simple;

    my $source = get("http://www.google.ca/search?hl=en&q=langley+public+library&meta=");
    print $source;


    "Age is nothing more than an inaccurate number bestowed upon us at birth as just another means for others to judge and classify us"

    sulfericacid