Re: web page source?

To get the source, I'd install LWP::Simple from CPAN. Fetching a page is then a simple two-liner:

use LWP::Simple;
my $source = get("http://whatever.url.you/want/to/view.html");

You only need the "use" line once in your program; call get() every time you need the source of a page. Note that get() returns undef if the fetch fails, so check the result before you use it.
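As a rough sketch of the "use once, get() many times" idea, something like the following fetches every URL given on the command line and skips the ones that fail. The helper name fetch_pages() and the script name in the usage comment are made up for illustration; only get() comes from LWP::Simple.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;    # loaded once, however many pages you fetch

# fetch_pages() is a hypothetical helper, not part of LWP::Simple.
# It returns a hash of url => source and skips any URL that fails.
sub fetch_pages {
    my @urls = @_;
    my %source_for;
    for my $url (@urls) {
        my $source = get($url);    # undef if the request failed
        if (defined $source) {
            $source_for{$url} = $source;
        }
        else {
            warn "Couldn't fetch $url\n";
        }
    }
    return %source_for;
}

# Usage (script name is an assumption): perl fetch.pl http://www.perlmonks.org
my %pages = fetch_pages(@ARGV);
print "$_: ", length($pages{$_}), " characters\n" for sort keys %pages;
```

Run with no arguments it does nothing; give it one or more URLs and it prints the length of each page it managed to fetch.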

Writing an HTML parser by hand is far from trivial, so I'd look at HTML::Parser (also on CPAN) and see if it makes your life easier. I've not really used HTML::Parser before, but from reading the documentation and playing around for the last 15 minutes, it appears you'd want something like the following, which prints the href of every link on the page:

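#!/usr/bin/perl -w

use strict;
use LWP::Simple;
use HTML::Parser;

# get() returns undef if the page couldn't be fetched
my $source = get("http://www.perlmonks.org");
die "Couldn't fetch the page\n" unless defined $source;

# api_version => 3 selects the handler-based interface used below
my $parser = HTML::Parser->new( api_version => 3 );
# 'tagname' is forced to lower case, so <A> and <a> both match
$parser->handler( start => \&print_link, 'tagname, attr' );
$parser->parse($source);
$parser->eof;    # tell the parser there's no more input

# Called for every start tag; prints the href of each link
sub print_link {
  my ($tag_name, $attr_ref) = @_;
  if ($tag_name eq 'a' && defined $attr_ref->{href}) {
    print $attr_ref->{href}, "\n";
  }
}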
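
If you want to try a start-tag handler without hitting the network, a minimal sketch along these lines parses a literal string instead. The sample markup, the @links array and the collect_link() name are all made up for illustration; the start_h constructor option, parse() and eof() are from HTML::Parser's version 3 interface.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser;

# Made-up sample markup; it stands in for whatever get() returned.
my $html = <<'HTML';
<html><body>
<a href="http://www.perlmonks.org/">Monks</a>
<A HREF="/faq.html">FAQ</A>
<a name="top">no href here</a>
</body></html>
HTML

my @links;

my $parser = HTML::Parser->new(
    api_version => 3,
    # tag and attribute names arrive lower-cased, so <A HREF> matches too
    start_h => [ \&collect_link, 'tagname, attr' ],
);
$parser->parse($html);
$parser->eof;

# Collect the href of each <a> tag; anchors without an href are skipped.
sub collect_link {
    my ($tag_name, $attr_ref) = @_;
    return unless $tag_name eq 'a' && defined $attr_ref->{href};
    push @links, $attr_ref->{href};
}

print "$_\n" for @links;
```

Collecting into an array rather than printing straight away makes it easier to de-duplicate or filter the links afterwards.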

	PerlMonks