Greetings,
Adding the infinitely advisable:
use strict;
use warnings;
To your sample yields:
Variable "@imgs" will not stay shared at par.pl line 17 (#1)
Drilling further down:
C:\TEMP>perl -Mdiagnostics par.pl
Variable "@imgs" will not stay shared at par.pl line 17 (#1)
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer subroutine.
When the inner subroutine is called, it will probably see the value of
the outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
Furthermore, if the outer subroutine is anonymous and references a
lexical variable outside itself, then the outer and inner subroutines
will never share the given variable.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are called or referenced, they
are automatically rebound to the current values of such variables.
I could not have said it better myself. :)
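For the curious, here is a minimal sketch of what the diagnostic describes, stripped of all the LWP machinery. The names (outer_named, outer_anon, collect_named, @seen) are made up for illustration and are not from your script. Compiling it should trigger the same "will not stay shared" warning for the named inner sub.

```perl
use strict;
use warnings;

# Hypothetical example: a named inner sub captures only the FIRST @seen.
sub outer_named {
    my ($item) = @_;
    my @seen;
    sub collect_named { push @seen, @_ }    # warns: will not stay shared
    collect_named($item);
    return scalar @seen;                    # size of the current call's @seen
}

# Same logic, but the inner sub is anonymous, so it is rebound to the
# fresh @seen on every call of the outer sub.
sub outer_anon {
    my ($item) = @_;
    my @seen;
    my $collect = sub { push @seen, @_ };
    $collect->($item);
    return scalar @seen;
}

my @named_counts = map { outer_named($_) } qw(a b c);
my @anon_counts  = map { outer_anon($_) }  qw(a b c);

print "named: @named_counts\n";    # named: 1 0 0
print "anon:  @anon_counts\n";     # anon:  1 1 1
```

Only the first call of outer_named sees what the inner sub pushed; from the second call on, the inner sub is still filling the original @seen while the outer sub looks at a new, empty one.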
Hence the working version:
use strict;
use warnings;
use HTML::LinkExtor;
use LWP::UserAgent;
use URI::URL;
sub parsedocument
{
    my ($url) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->env_proxy();

    # Set up a callback that collects image links
    my @imgs = ();
    my $callback = sub {
        my ($tag, %attr) = @_;
        return if $tag ne 'img';    # we only look closer at <img ...>
        push(@imgs, values %attr);
    };
    my $p = HTML::LinkExtor->new($callback);

    # Request document and parse it as it arrives
    my $res = $ua->request(HTTP::Request->new(GET => $url),
                           sub { $p->parse($_[0]) });

    # Expand all image URLs to absolute ones
    my $base = $res->base;
    @imgs = map { url($_, $base)->abs } @imgs;

    # Print them out
    print join("<br>", @imgs), "<br>";
}

# Called for its side effects only, so loop instead of map in void context
parsedocument($_) for @ARGV;
Note that the original of your sample (from the HTML::LinkExtor documentation) works
precisely because it runs in the program's main scope, where @imgs is created only once. When you wrap it in a sub, you get the problem you described, which is well known, for instance, to people using mod_perl and Apache::Registry.
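To illustrate why the top-level form is safe, here is a small sketch with a named callback at file scope. The callback name and the sample tags are invented for the example, and the callback is called by hand rather than through HTML::LinkExtor.

```perl
use strict;
use warnings;

# File-scoped lexical: there is only ever one @imgs, so a named sub
# that refers to it keeps seeing the same array.
my @imgs;

sub callback {
    my ($tag, %attr) = @_;
    return if $tag ne 'img';    # ignore everything except <img ...>
    push @imgs, values %attr;
}

# Hand-fed sample data (hypothetical), standing in for the parser
callback(img => (src  => 'a.png'));
callback(a   => (href => 'x.html'));
callback(img => (src  => 'b.png'));

print join(",", @imgs), "\n";    # a.png,b.png
```

No warning appears here, because the named sub is not nested inside another sub.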
Cheers,
alf
You can't have everything: where would you put it?