Re: Perl script to retrieve a webpage using perl

The easiest way to retrieve a webpage in Perl is to use the LWP::Simple module:

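use strict;
use warnings;
use LWP::Simple;

# get() returns undef if the request fails
my $page = get 'http://www.example.com';
die "Could not retrieve page\n" unless defined $page;
print $page;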
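
LWP::Simple's get only tells you that a request failed, not why. If the reason matters, a minimal sketch using LWP::UserAgent (from the same libwww-perl distribution) might look like this. The fetch_page name and the 10-second timeout are illustrative choices of mine, not part of the original post.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: returns the page body, or dies with the status line.
sub fetch_page {
    my ($url) = @_;

    # The timeout value is an arbitrary assumption for this sketch.
    my $ua = LWP::UserAgent->new( timeout => 10 );

    my $response = $ua->get($url);
    die "Could not fetch $url: " . $response->status_line . "\n"
        unless $response->is_success;

    return $response->decoded_content;
}

# Usage:
#   print fetch_page('http://www.example.com');
```

The status line (for example a 404 or a timeout message) ends up in the error, which is usually enough to tell a bad URL from a network problem.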

Another way is to use the wget executable, if you have it installed:

use strict;

my $url = 'http://www.example.com';

my $page = `wget -q -O - "$url"`;
print $page;

If you want even more interaction with the page, take a look at WWW::Mechanize. If you want to parse the page after retrieving it to extract data, take a look at HTML::TableExtract and/or HTML::Parser.

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ;    # The  
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new   #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' #  web


Re: Re: Perl script to retrieve a webpage using perl by liz (Monsignor) on Jul 20, 2003 at 12:30 UTC
my $url = 'http://www.example.com';
my $page = `wget -q -O - "$url"`;

Written this way, it is OK. But note that if the contents of $url come from an untrusted source (e.g. a field in a form or part of a URL), then simply calling wget with the parameter as shown is very dangerous. Consider what would happen if $url were '"; find /"'. Then consider what would happen if someone called a program other than "find".

Liz
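
To make the risk concrete, here is a small sketch that only builds and prints the command string the shell would receive. It runs nothing. The is_plain_http_url check is a hypothetical, deliberately strict allowlist of my own, not a complete URL validator.

```perl
use strict;
use warnings;

# Build the same string the backticks would hand to the shell.
sub wget_command {
    my ($url) = @_;
    return qq{wget -q -O - "$url"};
}

# Hypothetical allowlist: an http(s) scheme followed only by characters
# that are not special to the shell inside double quotes.
sub is_plain_http_url {
    my ($url) = @_;
    return $url =~ m{\Ahttps?://[A-Za-z0-9._~:/?&=%+,;-]+\z} ? 1 : 0;
}

for my $url ( 'http://www.example.com', '"; find /"' ) {
    print "url:     $url\n";
    print "command: ", wget_command($url), "\n";
    print "allowed: ", ( is_plain_http_url($url) ? "yes" : "no" ), "\n\n";
}
```

For the second value the shell sees two commands: wget with an empty URL, followed by find /. Rejecting input that fails a strict check helps, but not involving the shell at all is the more robust fix.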
Re: Re: Re: Perl script to retrieve a webpage using perl by sgifford (Prior) on Jul 20, 2003 at 15:58 UTC
You can make the use of wget secure by using the shell's quoting mechanism and environment variables. You can also use the `open(WGET, "-|")` construct with `exec` to do this safely. I agree in general that this is a less safe approach, but if it's the only option it can be done safely.
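
The two techniques named above could be sketched roughly as follows. sgifford's own code is not shown here, so this is an illustration under assumptions rather than the original: the WGET_URL variable name and both helper names are made up.

```perl
use strict;
use warnings;

# Technique 1: pass the untrusted value through the environment.
# $shell_command must be a constant written by the programmer; the shell
# expands "$WGET_URL" inside double quotes and does not re-parse the result.
sub capture_via_env {
    my ( $url, $shell_command ) = @_;
    local $ENV{WGET_URL} = $url;    # WGET_URL is a made-up name
    my $output = `$shell_command`;
    return $output;
}

# Technique 2: fork with open "-|" and exec the program with an argument
# list, so no shell is involved at all.
sub capture_via_exec {
    my (@command) = @_;

    my $pid = open( my $child, '-|' );
    die "Cannot fork: $!\n" unless defined $pid;

    if ( $pid == 0 ) {
        # Child: its STDOUT is connected to the parent's $child handle.
        exec { $command[0] } @command or die "Cannot exec $command[0]: $!\n";
    }

    # Parent: read everything the child wrote.
    local $/;
    my $output = <$child>;
    close $child;
    return $output;
}

# Intended use with wget (assumes wget is installed):
#   my $page = capture_via_env( $url, 'wget -q -O - "$WGET_URL"' );
#   my $page = capture_via_exec( 'wget', '-q', '-O', '-', $url );
```

In both cases a value such as '"; find /"' reaches wget as one literal argument. On Perl 5.8 and later, the list form open(my $fh, '-|', 'wget', '-q', '-O', '-', $url) should achieve the same as the second helper without the explicit fork and exec.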