"AFAIK, it is entirely up to the server and the applications running on it to decide how and if a directory is rendered at any given URL. Just because I can get a document at http://someurl.com/documents/mydoc.txt doesn't mean I can get a directory at http://someurl.com/documents/. I might."
I inferred from his question (where he said, "http - standard apache index") that this was already a non-issue: he has a particular directory in mind with a known directory-listing format. A standard Apache index page is plain HTML with one link per file, plus a "Parent Directory" link and sort-by-column links whose hrefs contain a query string; those last two kinds are exactly what the grep in the code below filters out.
"Which module would you suggest for reliably getting the contents of a remote directory via HTTP?"
Personally I'd use Web::Magic, but I'm biased.
use 5.010;
use strict;
use PerlX::MethodCallWithBlock;   # enables the ->grep { ... } block syntax below
use Path::Class qw(file dir);
use URI;
use Web::Magic -sub => 'web';
use XML::LibXML 2.0000;

my $listing     = URI->new('http://buzzword.org.uk/2012/');
my $destination = dir('/home/tai/tmp/downloaded/');

# Make sure destination directory exists.
$destination->mkpath;

web($listing)
    # Die if 404 or some other error
    -> assert_success
    # Find all the links on the page
    -> querySelectorAll('a[href]')
    # Skip uninteresting links
    -> grep {
        not (
            /Parent Directory/
            or $_->{href} =~ m{\?}   # has a query
            or $_->{href} =~ m{/$}   # ends in slash
        )
    }
    # Expand relative URI references to absolute URIs
    -> map {
        URI->new_abs($_->{href}, $listing)
    }
    # Save each to the destination directory
    -> foreach {
        # Figure out name of file to save as
        my $filename = $destination->file( [$_->path_segments]->[-1] );
        # Log a message
        printf STDERR "Saving <%s> to '%s'\n", $_, $filename;
        # Save it!
        web($_)->save_as("$filename");
    };
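For comparison, here's a rough sketch of the same job using only long-established CPAN modules (LWP::UserAgent, HTML::LinkExtor, URI). It's an untested illustration, not a drop-in replacement; the URL and destination path are just the same example values as above, and the filters mirror the grep in the Web::Magic version.

use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;
use File::Basename qw(basename);
use File::Path qw(make_path);

my $listing     = URI->new('http://buzzword.org.uk/2012/');
my $destination = '/home/tai/tmp/downloaded/';
make_path($destination);

my $ua  = LWP::UserAgent->new;
my $res = $ua->get($listing);
die $res->status_line unless $res->is_success;

# Collect the href attribute of every <a> element on the page.
my @hrefs;
my $extor = HTML::LinkExtor->new(sub {
    my ($tag, %attr) = @_;
    push @hrefs, $attr{href} if $tag eq 'a' and defined $attr{href};
});
$extor->parse($res->decoded_content);
$extor->eof;

for my $href (@hrefs) {
    next if $href =~ m{\?};   # sort-by-column links have a query string
    next if $href =~ m{/$};   # subdirectories and Parent Directory end in a slash
    my $uri      = URI->new_abs($href, $listing);
    my $filename = $destination . basename($uri->path);
    print STDERR "Saving <$uri> to '$filename'\n";
    my $save = $ua->get($uri, ':content_file' => $filename);
    print STDERR "Failed: ", $save->status_line, "\n" unless $save->is_success;
}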