Klaas has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I'm very new to Perl and somewhat new to programming, and I'd like to use Perl to read tens of thousands of files from an FTP server, run tests on those files on my own computer, and save the results. The problem is: I can connect to the file server, but I cannot change to the subdirectory and make Perl find the files I want. The snippet of code I use for finding the files is obviously very flawed, but my head is exploding and I ran out of coffee.

Thank you!
use strict;
use warnings;
use File::Find;
use Net::FTP;
use Cwd;

my $ftp = Net::FTP->new("ftp.ncbi.nih.gov", Timeout => 30)
    or die "Cannot connect to server: $@";

my $dir = "/genomes/Bacteria";
$ftp->cwd($dir) or die "Cannot cd to " , $dir;

my @directories = $ftp->ls() or die "cannot list any DIRs";
my @files;
find( sub { push @files, $File::Find::name if /\.fna$/ }, @directories )
    or die "can't find shit";

for(my $i=0; $i<@files; $i++){
    print "$i","x hooray"
}

$ftp->quit;
EDIT: working code, if anyone is interested:
use warnings;
use strict;
use Net::FTP;

my $ftp = Net::FTP->new("ftp.ncbi.nih.gov", Timeout => 30)
    or die "Cannot connect to server: $@";

## login is mandatory, even if no user and password is specified:
$ftp->login("anonymous",'-anonymous@') or die "Cannot login ", $ftp->message;

my $dir = '/genomes/Bacteria';
## move to the subdirectory:
$ftp->cwd($dir) or die "Cannot cd to " , $ftp->message();
#print "part 1\n";

my @dir_listing = $ftp->ls() or die $ftp->message;
## note: you need to save DIRs as arrays before you can search for specific
## files or anything
my @files;
for(my $j=0; $j<@dir_listing; $j++){
    ##list all the files in subdirs:
    my @file_list = $ftp->ls($dir_listing[$j]) or die $ftp->message;
    ##weed out all the files you want and put them in an array
    ##(dot escaped and pattern anchored, so only names ending in .fna match)
    my @found_files = grep(/\.fna$/, @file_list);
    push (@files, @found_files);
}

##Make a resultfile to print all the files to:
my $results = "D:/Genomes/results/filelist.dat";
open (OUTFILE, '>', $results) or die "Cannot write $!\n";
for(my $i=0; $i<@files; $i++){
    ##one file name per line
    print OUTFILE "$files[$i]\n";
}
close (OUTFILE);
$ftp->quit;

Re: New to Perl: Finding files on FTP
by muppetjones (Novice) on Mar 14, 2012 at 17:13 UTC

    You don't even need to change the directory. Give this a shot -- it stores the contents of a directory and removes non-files.

    my $dir = '/genomes/Bacteria';  # use single quotes where possible, e.g., no var or \n
                                    # it's a tiny bit faster

    ## get the contents of a directory
    my @dir_listing = <$dir/*>;
    ## note: using <$dir/*/> would find all the directories

    ## removes . and ..
    while ($dir_listing[0] =~ /^\.+$/) { shift @{$dir_listing}; }

    ## removes all directories
    my @files_found = map { (-d $_) ? () : $_ } @dir_listing;

      Thanks! You are right, I don't need to change the directory. Let me experiment with your code a bit, need to understand the map function and get a better grasp of the

      while ($dir_listing[0] =~ /^\.+$/) { shift @{$dir_listing}; }
      I guess I would do this with a for loop, so I can just shift $dir_listing at position i. And I need to search all the subdirs as well, but I think I'll find a way. Thanks again!
        If you can't change to the directory, you probably can't list the directory or get files from the directory either. Add $ftp->message() to all the "or die" messages after the connect to get the reason for failure (on connect, $@ gives the reason). E.g.:
        ... or die "cannot list any DIRs " . $ftp->message();

        So...I made a mistake. I generally prefer to use array and hash references, so shift @{$dir_listing} was out of habit. However, the code I gave actually declared an array, so shift @dir_listing; is the correct way to do it.

        Regarding your comments, the while loop looks at the first element in the array and checks to see if it is just a series of dots using a regex, i.e., either '.' or '..'. If so, it removes the first element and checks again.
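
The loop described above can be tried on a hard-coded list, so it runs without any FTP connection. This is only a sketch: the entries in @dir_listing are made up for illustration, and the guard against an empty array and the grep alternative are additions that were not in the original reply. Note that the while loop only strips dot entries from the front of the array, whereas grep drops them wherever they occur.

```perl
use strict;
use warnings;

# Hypothetical listing; real entries would come from $ftp->ls() or a glob.
my @dir_listing = ('.', '..', 'Acidovorax', '.', 'Bacillus');

# Look at the first element; while it consists only of dots, drop it and
# check again. The leading @dir_listing test stops the loop on an empty array.
while (@dir_listing && $dir_listing[0] =~ /^\.+$/) {
    shift @dir_listing;
}
# @dir_listing is now ('Acidovorax', '.', 'Bacillus'): the later '.' survives.

# Alternative (an assumption about what is wanted): keep every element that
# is NOT made up only of dots, regardless of its position.
my @no_dots = grep { $_ !~ /^\.+$/ } @dir_listing;

print "$_\n" for @no_dots;
```

If dot entries can turn up anywhere in the listing, the grep form is probably the safer of the two.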

        For map, you can think of it as a very specialized foreach loop (map { <expr> } <array>) that returns values depending on the code used in the brackets. However, it is a little slower than a foreach loop so you generally only want to use it when you're using its output. It's a great way to modify or filter out values from an array.
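
To make the comparison concrete, here is a small sketch that filters the same list three ways: with a foreach loop, with map, and with grep. The file names are invented for illustration, and grep is brought in here as the usual Perl idiom for pure filtering; it was not part of the reply above. The last line shows map doing what it is best at, transforming each element.

```perl
use strict;
use warnings;

# Hypothetical file names, made up for illustration.
my @listing = ('NC_000913.fna', 'NC_000913.gbk', 'NC_002695.fna', 'README');

# foreach version: push matching names one at a time.
my @with_foreach;
foreach my $name (@listing) {
    push @with_foreach, $name if $name =~ /\.fna$/;
}

# map version: return the name to keep it, or the empty list () to drop it.
my @with_map = map { $_ =~ /\.fna$/ ? $_ : () } @listing;

# grep version: keeps each element for which the block is true.
my @with_grep = grep { $_ =~ /\.fna$/ } @listing;

# map as a transformation: copy each name, then strip the .fna extension.
my @basenames = map { (my $base = $_) =~ s/\.fna$//; $base } @with_grep;

print "$_\n" for @basenames;
```

All three filters produce the same two .fna names; which one to use is mostly a matter of readability.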