in reply to scp cronjob

My best guess is that the $list file is being overwritten by multiple copies of the Perl script: when your bandwidth limits make one copy hang long enough, cron starts another copy before the first has finished.
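If that guess is right, one common guard (not something in your script, just a sketch) is to have each run take a non-blocking lock and exit if an earlier run still holds it. The lock file path and the acquire_lock name below are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical lock file; any path writable by the cron user would do.
my $lockfile = '/tmp/scp_backup.lock';

# Try to take an exclusive lock without waiting.
# Returns the open filehandle on success, undef if the lock is already held.
sub acquire_lock {
    my ($path) = @_;
    open my $fh, '>>', $path or die "Cannot open $path: $!";
    return flock($fh, LOCK_EX | LOCK_NB) ? $fh : undef;
}

my $lock = acquire_lock($lockfile)
    or exit 0;    # an earlier run is still busy, so leave quietly

# ... the existing backup work would go here ...
# The lock is released when the filehandle is closed or the process exits.
```

Keep $lock in scope for the whole run; once the handle is closed, the next cron run is free to proceed.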

Here are some other thoughts:

  • You don't need the $list file. Use backticks instead:
    @backupFiles = `ssh user\@backupserver ls /usr/local/apache/htdocs`;
  • You are using File::Find and "globbing" (</path/*.htm>) together. This is redundant; use one or the other.
  • Do you really want to back up only the new files, but not any changed files? This confuses me.
  • ls -c is an odd command for this program; you are asking ls to sort by ctime, then throwing away the sort order by using a hash. Plain ls would be clearer.
  • Rather than use cron, you might just have the Perl script run as a daemon, never ending until you kill it. This would prevent the multi-copy problem, too.
    while (1) { do_something(); sleep 60; }
  • If bandwidth is a problem, keep a flagfile whose modtime is equal to the time of the last scp transfer. If no local file is newer than the flagfile, you can skip the ssh traffic.
  • You are calling scp once for each file; instead, you could build a list of files and call scp once:
    my @files = map { "'$_'" } grep { not $chompedList{basename($_)} } </usr/myfiles/*.htm>;
    system "scp -C @files user\@backupserver:/usr/local/apache/htdocs";
  • You are using fileparse where basename would be clearer.
  • Finally, listen to ides. Rsync was designed for this kind of job. It has lots of options to fine-tune what gets synced and how; it can even limit its bandwidth use. Rsync should work great by itself as a cron job, but here is a (lightly tested) script to demonstrate my other points:
    #!/usr/bin/perl -W
    use warnings 'all';
    use strict;

    my $rmthost  = 'backupserver';
    my $rmtuser  = 'user';
    my $rmtpath  = '/usr/local/apache/htdocs';
    my $lclpath  = '/usr/myfiles';
    my $lclglob  = '*.htm';
    my $flagfile = '.last_backup';
    my $r_opts   = "-azq --blocking-io -e 'ssh -l $rmtuser'";
    my $lclfiles = "$lclpath/$lclglob";

    sub modtime {
        my $file = shift;
        my @stats = stat $file or return 0;
        return $stats[9];
    }

    while (1) {
        my $timestamp = modtime("$lclpath/$flagfile");
        my $run = grep {modtime($_) > $timestamp} glob $lclfiles;
        if ($run) {
            system "rsync $r_opts $lclfiles $rmthost:$rmtpath";
            open FLAG, ">$lclpath/$flagfile" or die;
            close FLAG or die;
        }
        sleep 60;
    }
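If you do stay with scp, the backtick and basename suggestions from the list might combine roughly as follows. This is only a sketch: not_on_remote and %on_remote are names I made up (your script calls the hash %chompedList), and the sample lists stand in for the real ssh and glob output.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(basename);

# Given the raw lines of a remote `ls` and a list of local paths,
# return the local paths whose basename is missing from the remote listing.
sub not_on_remote {
    my ($remote_lines, $local_paths) = @_;
    chomp(my @names = @$remote_lines);    # strip the newlines that ls output carries
    my %on_remote;
    $on_remote{$_} = 1 for @names;
    return grep { !$on_remote{ basename($_) } } @$local_paths;
}

# In the real script the two lists would come from something like:
#   my @remote = `ssh user\@backupserver ls /usr/local/apache/htdocs`;
#   my @local  = </usr/myfiles/*.htm>;
my @remote = ("index.htm\n", "about.htm\n");
my @local  = ('/usr/myfiles/index.htm', '/usr/myfiles/news.htm');

print "$_\n" for not_on_remote(\@remote, \@local);
# prints /usr/myfiles/news.htm
```

The returned list can be quoted and handed to a single scp call, so no temporary $list file is needed.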