in reply to Cannot retrieve HTML for some pages with LWP
However, you may be interested in the Google web APIs, for which there are modules (Net::Google and DBD::Google) on CPAN.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $url = "http://scholar.google.com/scholar?hl=en&lr=&q=machine+learning";

    my $ua = LWP::UserAgent->new;
    $ua->env_proxy;                           # pick up proxy settings from the environment
    $ua->agent("Mozilla/5.0 (Windows)");      # replace LWP's default User-Agent string

    my $response = $ua->get($url);
    if ($response->is_success) {
        print $response->content;
    }
    else {
        die $response->status_line;
    }
Also, if you just want the text of a web page, you may find "lynx -dump" easier to use than Perl. It runs under Cygwin on Windows.
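If you would rather stay in Perl and still let lynx do the rendering, something like the following might work. It is only a rough sketch: it assumes lynx is installed and on your PATH, and the names lynx_dump_command and dump_page are made up for this example, not part of any module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for "lynx -dump URL" (hypothetical helper name).
sub lynx_dump_command {
    my ($url) = @_;
    return ('lynx', '-dump', $url);
}

# Run lynx and return the rendered page text (hypothetical helper name).
# Dies if lynx cannot be started or exits with a non-zero status.
sub dump_page {
    my ($url) = @_;

    # The list form of a pipe open bypasses the shell, so characters
    # such as '&' and '?' in the URL need no quoting.
    open(my $fh, '-|', lynx_dump_command($url))
        or die "Cannot run lynx: $!";

    local $/;    # slurp mode: read all of the output at once
    my $text = <$fh>;

    close($fh) or die "lynx failed (exit status $?)";
    return $text;
}

# Fetch only when run as a script with a URL argument.
print dump_page($ARGV[0]) if @ARGV && !caller();
```

Save it under any name and pass the URL as the only argument, quoted so that your shell does not interpret the '&' characters. Note that the list form of a pipe open relies on fork, so this is more likely to work under Cygwin's Perl than under an older native Windows Perl.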