asw has asked for the wisdom of the Perl Monks concerning the following question:

Hello. New here. I'm writing a script to download data via FTP. The script will run in cron every 5 minutes or so, hence the transfer.lock mechanism.


My question is, is there a more efficient way to write this? Should I contain some of these things in subs? Here's the code:

#!/usr/bin/perl -w
use strict;

use Net::FTP;

my $dir = "/tmp";
my $host = "host_with_most";
my $user = "some_user";
my $password = "cool_edgy_pass";
my $ftp_dir = "some/dir";
my $ftp_file_mtime;
my %ftp_file_mtime_hash = ();
my @ftp_file_lst_sorted;
my $oldest_file;
my $lock_dir = "/tmp";
my $dir_handle;
my @tmp_file_list;
my $lock_file = "transfer.lock";
my $start_size_file;
my $running_size_file;
my $ftp_size_check;
my $ftp_file_size;

## change to the directory into which we will download our file
chdir($dir) or die("Can't chdir to $dir");

## open /tmp and read the file list into an array --
## probably a more efficient way of doing this...
opendir($dir_handle, $lock_dir) or die("Can't open $lock_dir");
@tmp_file_list = grep { (!/^\./) && -f "$lock_dir/$_" } readdir($dir_handle);
closedir($dir_handle);

## iterate through the array and test each element against the
## presence of the lock file. If present, exit. If the lock file
## isn't present, create one and proceed with the file transfer.
foreach (@tmp_file_list) {
    if ($_ eq $lock_file) {
        die("Transfer_helper is busy.");
    }
}
system("touch /tmp/transfer.lock");

## create the FTP connection
my $ftp_conn = Net::FTP->new($host) or die("Can't connect to $host");
$ftp_conn->login($user, $password) or die("$user can't login");

## change the working directory
$ftp_conn->cwd($ftp_dir) or die("can't change to FTP dir $ftp_dir");

## the directory listing is obtained with the 'ls' FTP command
my @ftp_dir_file_lst = $ftp_conn->ls;

## iterate through the file list, look at the mtime in epoch seconds
## for each file, and build a hash of name_of_file => mtime_in_seconds
foreach (@ftp_dir_file_lst) {
    $ftp_file_mtime = $ftp_conn->mdtm($_)
        or die("Can't get mtime for $_");
    $ftp_file_mtime_hash{$_} = $ftp_file_mtime;
}

## sort the hash by value, newest first; the oldest file is last
@ftp_file_lst_sorted =
    sort { $ftp_file_mtime_hash{$b} <=> $ftp_file_mtime_hash{$a} }
    keys %ftp_file_mtime_hash;
$oldest_file = $ftp_file_lst_sorted[$#ftp_file_lst_sorted];

## get the size of the file we want to download, wait 15 seconds,
## and check again; keep polling until the size stops changing.
## This is to mitigate downloading of incomplete files.
$ftp_file_size = $ftp_conn->size($oldest_file);
sleep(15);
$ftp_size_check = $ftp_conn->size($oldest_file);
while ($ftp_size_check != $ftp_file_size) {
    $ftp_file_size = $ftp_size_check;
    sleep(15);
    $ftp_size_check = $ftp_conn->size($oldest_file);
}

## compare start_size_file with running_size_file, re-checking until
## the local copy has reached the remote size. Unlink transfer.lock
## and exit.
$start_size_file = $ftp_conn->size($oldest_file);
$ftp_conn->get($oldest_file) or die("Can't get file $oldest_file");
$running_size_file = (stat('/tmp/' . $oldest_file))[7];
while ($running_size_file < $start_size_file) {
    sleep(10);
    $running_size_file = (stat('/tmp/' . $oldest_file))[7];
}
unlink('/tmp/' . $lock_file);

thanks in advance

Re: improve FTP script
by Illuminatus (Curate) on Jun 14, 2012 at 18:01 UTC
    Well, one thing that comes to mind is that you 'die' in a lot of places but only unlink the lockfile on success. If you run into a transient problem and want the script to keep running after it clears, you should probably replace die with a function that removes the lockfile and then dies.
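
    For concreteness, a minimal sketch of the helper being described (the sub name and the hard-coded lock path are illustrative, not from the post):

        sub clean_die {
            my ($msg) = @_;
            ## remove the lock so the next cron run isn't blocked forever
            unlink('/tmp/transfer.lock');
            die($msg);
        }

    and then each error check becomes, e.g.:

        $ftp_conn->cwd($ftp_dir) or clean_die("can't change to FTP dir $ftp_dir");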

    fnord

      thanks for the reply. yeah, that is true re: "die". I was looking it over again, and ran into a possible problem:

      if (scalar(@tmp_file_list) == 0) {
          die("the directory is empty!");
      }
      else {
          foreach (@tmp_file_list) {
              if ($_ eq $lock_file) {
                  die("Transfer_helper is busy.");
              }
          }
      }

      -- if there are no files in the dir, the script should exit (and not write the lock file).

      I like the idea of creating a sub for managing the lock file. thanks
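
      As a rough sketch, that sub might look like this (the sub names are my own, and the -e test is an assumption that replaces the readdir/grep scan, since only the one file matters):

          sub take_lock {
              my ($path) = @_;
              ## a simple existence test instead of listing the whole directory
              die("Transfer_helper is busy.") if -e $path;
              open(my $fh, '>', $path) or die("Can't create lock $path");
              close($fh);
          }

          sub drop_lock {
              my ($path) = @_;
              unlink($path) or warn("Couldn't remove lock $path: $!");
          }

      Call take_lock('/tmp/transfer.lock') once at startup and drop_lock on every exit path, including the failure cases mentioned above.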

Re: improve FTP script
by Anonymous Monk on Jun 14, 2012 at 18:29 UTC
    Don't use Perl and FTP. Use NFS or rsync or cp --update to mirror directories over a network. Don't use cron: how do you know each run will finish within 5 minutes? Networks have transient disruptions and congestion.
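
    For concreteness, a one-shot mirror pull with rsync (using the placeholder host and paths from the original post) might look like:

        rsync -av --partial host_with_most:some/dir/ /tmp/

    By default rsync writes to a temporary file and renames it into place only when the transfer completes, so the receiver never sees a partial file; that sidesteps the size-polling in the original script.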