Both of the replies above are good, and I would go with them first. However, I decided to toy with your question and created the following script. If nothing else, it might give you something to play with.

A few notes regarding the code:

#!/usr/bin/perl -w
use strict;
use Compress::Zlib;
use File::Glob ':glob';
use File::Spec;
use HTTP::Request;
use LWP;
use LWP::UserAgent;
use Sort::Versions;

my $DEBUG    = 0;
my $agent_id = 'CPANChecker.pl/0.1 ';

# my $cpan_site_root = $ARGV[0] || 'ftp://ftp.cpan.org/pub/CPAN';
my $cpan_site_root   = $ARGV[0] || 'http://www.perl.com/CPAN';
my $directory_match  = qr!modules/!;
my $find_ls_filename = 'indices/find-ls.gz';
my $file_match       = qr!(\.tar\.gz|\.tgz)!;
my $target_path      = '~/reference/cpan';
$target_path = bsd_glob( $target_path, GLOB_TILDE | GLOB_ERR );

my $ua = LWP::UserAgent->new;
$ua->agent($agent_id);

# Fetch the compressed listing of every file on the CPAN site.
my $req = HTTP::Request->new(
    GET => join( '/', $cpan_site_root, $find_ls_filename ) );
$req->header( Accept => "text/html, */*;q=0.1" );
my $res = $ua->request($req);

if ( $res->is_success ) {
    my $find_ls_file = $res->content;
    my (%files);
    my @filelist = grep( /$directory_match/,
        split( /\n/, Compress::Zlib::memGunzip($find_ls_file) ) );
    print( scalar(@filelist), "\n" ) if ($DEBUG);

    # Group the archive names by directory and by module name.
    foreach ( 0 .. $#filelist ) {
        $filelist[$_] =~ s/\r//g;
        my @parts    = split( /\s+/, $filelist[$_] );
        my $filepath = $parts[8];
        next unless $filepath =~ /^$directory_match/;
        next unless $filepath =~ /$file_match$/;
        my ( undef, $path, $file ) = File::Spec->splitpath($filepath);

        # Drop the trailing "-version.ext" piece to get the module name.
        @parts = split( /-/, $file );
        pop(@parts);
        my $module = join( '-', @parts );
        push( @{ $files{$path}{$module} }, $file );
    }

    foreach my $k ( sort( keys(%files) ) ) {

        # Create the local directory tree one level at a time.
        my $pathname = $target_path;
        foreach my $p ( split( /\//, $k ) ) {
            $pathname = File::Spec->catfile( $pathname, $p );
            if ( !-e $pathname ) {
                print "Creating $pathname...\n";
                mkdir($pathname) or die("Error: $!\n");
            }
        }
        print( $k, "\n" );

        foreach my $m ( sort( keys( %{ $files{$k} } ) ) ) {

            # Sort newest first, so $parts[0] is the latest version.
            my @parts = sort( { versioncmp( $b, $a ) } @{ $files{$k}{$m} } );
            print(
                'key: ',    $k, "\n",
                'module: ', $m, "\n",
                "\t", File::Spec->catfile( $target_path, $k, $parts[0] ), "\n"
            ) if ($DEBUG);

            $res = $ua->mirror(
                join( '/', $cpan_site_root, $k, $parts[0] ),
                File::Spec->catfile( $target_path, $k, $parts[0] )
            );

            # mirror() returns a response object, so ask it directly.
            # A 304 (already up to date) is neither success nor error.
            if ( $res->is_success ) {
                print( "...", $parts[0], "\n" );
            }
            if ( $res->is_error ) {
                print( "Error in retrieving $k/$parts[0] : ",
                    $res->status_line, "\n" );
            }
        }
    }
}
else {
    print( "Error in retrieving $cpan_site_root/$find_ls_filename : ",
        $res->status_line, "\n" );
}
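If you want to play with the "pick the newest archive per module" step without installing anything, here is a minimal sketch using only core Perl. Note that it swaps Sort::Versions for a hand-rolled, component-by-component comparison, which is only an approximation of versioncmp(); dotted_cmp() and latest_by_module() are hypothetical helper names, and the file names are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare dotted versions one component at a time,
# so that 1.10 sorts after 1.9.
sub dotted_cmp {
    my ( $left, $right ) = @_;
    my @l = split /\./, $left;
    my @r = split /\./, $right;
    while ( @l || @r ) {
        my $x = @l ? shift @l : 0;
        my $y = @r ? shift @r : 0;
        my $c = ( $x =~ /^\d+$/ && $y =~ /^\d+$/ ) ? $x <=> $y : $x cmp $y;
        return $c if $c;
    }
    return 0;
}

# Hypothetical helper: map each module name to its newest archive file.
sub latest_by_module {
    my @files = @_;
    my %latest;
    foreach my $file (@files) {

        # "Foo-Bar-1.02.tar.gz" becomes ( "Foo-Bar", "1.02" );
        # names that do not look like an archive are skipped.
        my ( $module, $version ) =
          $file =~ /^(.+)-([^-]+?)(?:\.tar\.gz|\.tgz)$/
          or next;
        if ( !exists $latest{$module}
            || dotted_cmp( $version, $latest{$module}{version} ) > 0 )
        {
            $latest{$module} = { version => $version, file => $file };
        }
    }
    return map { ( $_ => $latest{$_}{file} ) } keys %latest;
}

# Made-up file names, for illustration only.
my %newest = latest_by_module(
    'Foo-Bar-1.02.tar.gz', 'Foo-Bar-1.10.tar.gz',
    'Foo-Bar-1.9.tgz',     'Baz-0.5.tar.gz',
    'README',
);
print "$_ => $newest{$_}\n" for sort keys %newest;
```

Unlike the script, this keeps only the winner for each module rather than sorting the whole list, which is enough when all you want is the latest file.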

Update: 06 Mar 2004: Thanks to leira's suggestions, modified the code above to use the File::Spec module for dealing with the filenames. There may still be room for improvement, but it is getting there.
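For anyone who has not used File::Spec, the two calls the script leans on can be tried on their own. This is only a sketch: the mirror root and the file path are made up, and the values in the comments assume a Unix-like system.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;

my $target_path = '/tmp/cpan';    # hypothetical local mirror root
my $filepath    = 'modules/by-module/Foo/Foo-Bar-1.02.tar.gz';    # made-up entry

# splitpath() returns ( volume, directories, file ); the volume is
# empty on Unix-like systems, so it is discarded here.
my ( undef, $dir, $file ) = File::Spec->splitpath($filepath);

# catfile() joins the pieces and tidies up doubled separators.
my $local = File::Spec->catfile( $target_path, $dir, $file );

print "dir:   $dir\n";
print "file:  $file\n";
print "local: $local\n";
```

The directory part keeps its trailing slash, which is why the script can use it directly as a hash key and still rebuild a clean path with catfile().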

Update: 06 Mar 2004: Fixed $target_path to handle relative locations (with thanks to castaway for suggesting a module that could handle ~).
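The tilde handling can be tried in isolation as well. This sketch points HOME at a throwaway directory so that it does not depend on your real home directory; the temporary directory and the "reference" subdirectory are set up purely for the demonstration. It uses the same flags as the script and only exercises the case where the target directory already exists.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_TILDE GLOB_ERR);
use File::Spec;
use File::Temp qw(tempdir);

# Throwaway stand-in for a home directory (demonstration only).
my $fake_home = tempdir( CLEANUP => 1 );
mkdir( File::Spec->catdir( $fake_home, 'reference' ) )
  or die("Error: $!\n");
local $ENV{HOME} = $fake_home;

# GLOB_TILDE asks bsd_glob() to expand a leading "~" to the home directory.
# List assignment takes the first (and here the only) match.
my ($expanded) = bsd_glob( '~/reference', GLOB_TILDE | GLOB_ERR );

print "expanded: $expanded\n";
```

Checking that $expanded is defined before using it is probably a good idea in a real script.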

Update: 15 Mar 2004: Fixed apparent error in mirror statement when using relative paths. Changed several places where paths were assembled manually to use File::Spec->catfile().

Update: 02 May 2004: Changed $cpan_site_root's default value to point to the CPAN Multiplexer (http://www.perl.com/CPAN/).


In reply to Re: code to grab the latest version of a module from CPAN? by atcroft
in thread code to grab the latest version of a module from CPAN? by perrin
