Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks!
I am new to Perl and I don't know what I can do to solve the following problem:
I want to connect to an FTP site (namely ftp://ftp.ncbi.nih.gov/genomes/Bacteria) and download all of its subfolders locally, but only the files with the extension .faa.
For example, I want to download the folder Bacteria (as I wrote above) and then, recursively, all of its subfolders (like Acidobacteria_bacterium_Ellin345), keeping only the files with the extension .faa (like NC_008009.faa, which is inside the Acidobacteria_bacterium_Ellin345 folder). I thought of downloading the whole FTP folder with the Linux command wget -r, but the files are rather large and that would take some time. I believe I must store the names of all the folders in a list, then open each folder and apply wget only to the files with the .faa extension. I have no idea how to connect to FTP sites using Perl, though...
Any hints would be greatly appreciated...

Replies are listed 'Best First'.
Re: connecting to ftp sites and download certain files based on file extensions
by odha57 (Monk) on Oct 06, 2006 at 14:22 UTC
    Welcome to using Perl! Here is a basic example of using the Net::FTP module to get a list of files (in the example, they are .xml files). The program logs in, changes to a directory called bulk_download, gets the list of files, and then downloads them. At the end, it says how many files were fetched. Hope this helps!
    #!/usr/bin/perl -w
    use strict;
    use Net::FTP;                               # use the ftp module

    my (@filelist, $file, $ftp, $ftp_count);
    my $host = 'your ip address';
    my $user = 'username';                      # user name for login
    my $pass = 'password';                      # password for login

    $ftp_count = 0;
    $ftp = Net::FTP->new($host, Debug => 0)     # start an FTP session
        or die "Cannot connect to $host: $@";
    $ftp->login($user, $pass)                   # login
        or die "Cannot log in: ", $ftp->message;
    $ftp->cwd("bulk_download")                  # go to the bulk_download directory
        or die "Cannot change directory: ", $ftp->message;
    $ftp->binary;                               # make sure we ftp the file as binary
    @filelist = $ftp->ls("*.xml");              # and get the list of .xml files
    foreach $file (@filelist) {
        if ($ftp->get($file)) {                 # fetch it, counting only successes
            ++$ftp_count;
        }
        else {
            warn "Could not fetch $file: ", $ftp->message;
        }
    }
    $ftp->quit;
    print "For $host fetched $ftp_count files\n";
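For the original question (walk the subfolders and keep only the .faa files), the same Net::FTP calls can be combined with a small recursive routine. The following is only a rough sketch, not a tested mirror tool. The helper names classify_listing and mirror_faa, the script name fetch_faa.pl, and the local directory name Bacteria are all made up for illustration. It also assumes that the server accepts anonymous logins, that it returns Unix-style "ls -l" listings from dir, and that no file or folder name contains spaces.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::FTP;
use File::Path qw(make_path);

# Split Unix-style "ls -l" lines (as returned by Net::FTP's dir method)
# into subdirectory names and names of plain files ending in $ext.
# Assumption: Unix-style listing, and names without spaces.
sub classify_listing {
    my ($ext, @lines) = @_;
    my (@dirs, @files);
    for my $line (@lines) {
        next unless $line =~ /^[d-]/;       # skip "total N", symlinks, etc.
        my $name = (split ' ', $line)[-1];  # last field is the name
        next if $name eq '.' or $name eq '..';
        if ($line =~ /^d/) {
            push @dirs, $name;
        }
        elsif ($name =~ /\Q$ext\E$/) {
            push @files, $name;
        }
    }
    return (\@dirs, \@files);
}

# Recursively fetch every file ending in $ext below $remote into $local,
# recreating the remote folder layout. Returns the number of files fetched.
sub mirror_faa {
    my ($ftp, $remote, $local, $ext) = @_;
    my ($dirs, $files) = classify_listing($ext, $ftp->dir($remote));
    my $count = 0;
    if (@$files) {
        make_path($local);                  # create the local folder on demand
        for my $name (@$files) {
            if ($ftp->get("$remote/$name", "$local/$name")) {
                $count++;
            }
            else {
                warn "Could not fetch $remote/$name: ", $ftp->message;
            }
        }
    }
    $count += mirror_faa($ftp, "$remote/$_", "$local/$_", $ext) for @$dirs;
    return $count;
}

# Runs only when a host and a remote directory are given, for example:
#   perl fetch_faa.pl ftp.ncbi.nih.gov /genomes/Bacteria
if (@ARGV >= 2) {
    my ($host, $remote, $local) = @ARGV;
    $local = 'Bacteria' unless defined $local;   # hypothetical default
    my $ftp = Net::FTP->new($host, Passive => 1)
        or die "Cannot connect to $host: $@";
    $ftp->login('anonymous', 'anonymous@')
        or die "Cannot log in: ", $ftp->message;
    $ftp->binary;                                # transfer files as binary
    my $count = mirror_faa($ftp, $remote, $local, '.faa');
    $ftp->quit;
    print "Fetched $count .faa files from $host\n";
}
```

Parsing the dir output keeps the number of round trips low, but it is the fragile part: if the server uses a different listing format, a common alternative is to take the names from ls and try cwd on each one to see whether it is a directory.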
Re: connecting to ftp sites and download certain files based on file extensions
by philcrow (Priest) on Oct 06, 2006 at 13:47 UTC
    Sounds like you should try Net::FTP. It can connect to sites and do all the things you need to do: ls, get, etc.

    Phil