http://qs1969.pair.com?node_id=80727
Category: Utility Scripts
Author/Contact Info John M. Dlugosz http://www.dlugosz.com/tools/index.html#work_backup
Description: This is a Perl program to perform daily backups of interesting "work" files in an intelligent manner. Developed under Win32, should be OK on all platforms.
require v5.6;
use strict;
use warnings;
use File::Spec;
use File::Path;
use File::Copy;
use File::Find;
use Getopt::Long qw (:config bundling);
our $VERSION= v1.0.3;

=head AUTHOR  John M. Dlugosz - john@dlugosz.com

=head ABSTRACT

This is a Perl program to perform daily backups of interesting "work" 
+files
in an intelligent manner.  See http://www.dlugosz.com/tools/index.html
+#work_backup
for more information and the latest version.

=head HISTORY

v1.0.1 - quick fix to put fatal error messages into LOG.

v1.0.2 - major bug fix! Got comparison backwards and didn't copy chang
+ed files,
only new files.

- deal with read-only files by removing the read-only bit on the dest.
+  Uses chmod
for portability, but abstracted for easy customization if needed.

- turn on WIDE_SYSTEM_CALLS.  It doesn't break anything (e.g. File::Fi
+nd module) and
does now copy files with names outside the common OEM/ANSI code pages.

- run under ActiveState Perl 623, Perl 6.5.1.

- add filename skip rules for Source Insight and Visual Studio interme
+diate files.  They
generate errors when copying at night with programs still open, and we
+ don't need them,
so filtering out will keep clutter out of the log.  The rules should b
+e smart enough to
prevent false hits.

v1.0.3 - further abstract the copy function in anticipation of future 
+changes, add eval
block around a single file's processing.

- add argument processing, with a few implemented and help screen to d
+ocument
plans for other options.

planned for v1.1 - trap individual errors and continue with rest of fi
+les.

plans for future versions - encrypt destination files.

=head INSTRUCTIONS

Edit the values in the next section to indicate where you want the fil
+es copied to and
which directories you want copied.  Don't include a trailing slash in 
+directory names!

For the "locations", make a list of source directories you want copied
+.  The 'source' or 'docs'
string is a "rule".  'norule' is also a rule that has no special proce
+ssing but will copy all files.

You can include "rules" to skip copying of specified files or director
+ies.  Edit the "skip_directory" or
"skip_file" function to add your intelligence.  For example, 'source' 
+directories don't back up the
compiler output, and 'docs' directories skip MS Word's litter.  The "r
+ule" name is only used by
these functions, so define the names and the logic to suit.

The basic behavior of this program is to copy the files from your sour
+ce locations to the
backupdir.  It will skip files based on date, only overwriting an olde
+r file with a newer.

=cut

##########################
### Configuration section - edit this.

my $custom_message= 'Configured for JMD';

my $backupdir= '\\\\data01\\A0436084\\backup';

my @locations= (
   [ source => 'D:\\dev' ],
   [ docs => 'D:\\My Documents' ],
   [ source => 'F:\\dev' ],
   [ docs => 'F:\\Documents' ],
   [ norule => 'D:\\Program Files\\util']
   );

###########################
### "rules" for skipping files. customize or extend this, if needed.
# functions return true to skip, false to backup normally.

${^WIDE_SYSTEM_CALLS}= 1;  # comment this out on Win9X/Me if it's not 
+simply ignored.

sub skip_directory ($$)
 {
 my ($directory, $rule)= @_;
 if ($rule eq 'source') {
    return 1  if $directory =~ /Release/i;  # dirname contains "Releas
+e", includes things like "bin_Release" and "Releaseme";
    return 1  if $directory =~ /Debug/i;  # likewise
    # >>>>> add rules here.
    }
 return undef;  # don't skip.
 }

sub skip_file ($$)
 {
 my ($filename, $rule)= @_;
 if ($rule eq 'source') {
    if ($filename =~/\.(?:IAB|IMB)$/) {  # all-caps KNOWN, so no i.
       # skip IAB and IMB files if a PR file with the same basename al
+so exists.
       # these generate "file open" errors when Source Insight is runn
+ing.
       (my $othername=$filename) =~ s/\....$/\.PR/;
       return 1  if -e $othername;  # looks like Source Insight index 
+files.
       }
    if ($filename =~ /\.(?:ncb|opt)$/) { # all-lower known, so no i.
       (my $othername=$filename) =~ s/\....$/\.dsw/;
       # skip .NCB and .OPT if a .DSW exists.
       return 1  if -e $othername;  # Visual Studio project files inte
+rmediate junk.
       }
    # >>>>> add rules here.
    }
 elsif ($rule eq 'docs') {
    return 1  if $filename =~ /[\\\/]~.+\.tmp$/i;  # skip filenames li
+ke ~XXX.TMP;
    }
 return undef;  # don't skip.
 }

###########################

my $verbose= 0;  # enable with -v
my $quiet= 0;
my $nolog= 0;  # enable with -s
my $filecount= 0;

###########################

sub prep_target_file ($)
# this was introduced to deal with read-only files.  It is called on t
+he dest filename
# before the copy takes place, and may do any work needed in this situ
+ation.
 {
 return unless -e $_[0];
 chmod 0777, $_[0];
 }
 
###########################

sub check_backupdir
 {
 eval { mkpath ($backupdir); };
 if ($@) {
    die "Error: cannot verify or create directory [$backupdir]\n$@\n";
    }
 unless ($nolog) {
    my $logfile= File::Spec->catfile ($backupdir, "backup_log.txt");
    print "logging results to [$logfile]\n"  if $verbose;
    open LOG, ">> $logfile"  or die "Cannot open file [$logfile] for w
+riting.\n";
    my $time= localtime();
    print LOG "Started $time\n";
    }
 }


sub perform_copy ($$)
 {
 my ($sourcefile, $destfile) = @_;
 unless (copy ($sourcefile, $destfile)) {
    my $err= $!;
    my $err2= $^E;
    my $message= "ERROR COPYING FILE\n from $sourcefile\n to $destfile
+\n $err\n $err2\n";
    print LOG $message  unless $nolog;
    print $message;
    }
 }

sub do_backup ($$$)
 {
 my ($sourcefile, $destfile, $rule)= @_;
 if (-d $sourcefile) {
    # do the directory
    if (skip_directory ($sourcefile, $rule)) {
       $File::Find::prune= 1;  # tell find to skip the contents.
       return;
       }
    mkpath ($destfile);
    }
 else {  # is a file
    return  if skip_file ($sourcefile, $rule);
    return if -e $destfile && (-M $destfile <= -M $sourcefile);  # cop
+y is up to date.
    print "==> $sourcefile\n" if $verbose;
    ++$filecount;
    prep_target_file ($destfile);
    perform_copy ($sourcefile, $destfile);
    }
 }
 
sub process_location ($)
 {
 my ($rule, $source)= @{shift @_};
 my $dest= $source;
 # change drive letter or server name to subdir
 $dest =~ s/^(.):/Drive $1/;
 $dest =~ s/^\\\\/\\/;
 $dest= File::Spec->catdir ($backupdir, $dest);
 print "Backing up [$source] to [$dest] using rule '$rule'\n"  if $ver
+bose;
 my $re= qr/^\Q$source\E/;
 my $sub= sub {
    my $sourcefile= $File::Find::name;
    (my $destfile= $sourcefile) =~ s/$re/$dest/;
    eval {
       do_backup ($sourcefile, $destfile, $rule);
       };
    if ($@) {
       my $message= "EXCEPTION CAUGHT trying to copy from [$sourcefile
+] to [$destfile].\n$@";
       print LOG $message  unless $nolog;
       print $message;
       }
    };
 find ($sub, $source);
 print "Copied a total of $filecount files\n"  if $verbose;
 }

sub show_options
 {
 print <<"EOF";
NOTE: this is a placeholder and design docs.  Options are not
implemented yet.

Usage:
    With no parameters, performs the pre-configured backup.  Change th
+e
    configuration by editing the config section of \"$0\".

    If filenames are given, will backup (or restore) just those named 
+files
    (directory names OK).  If the named file is within the backup subd
+irectory
    ($backupdir) then the file will be restored to the original positi
+on.
    Otherwise it is taken as the name of a file to backup.

    Flags can modify this behavior.  Flags are not order-dependant and
    mean the same thing whether before, mixed with, or after filename
    arguments.

 Flags:
    -restore  Restores the named files.  Unlike the implicit restore a
+bove,
    the filename is the original location, not the backup location.
    -full  Ignore dates, always copy the file.
    -all  Ignore the skip rules and copy all files.
    -verbose  Print messages concerning program's operation.
    -output=xxx  Specify a different directory for restored files.
    -keep  Rename restored files by adding '.restored' to the name.
    -secret  Don't write to logfile.
    -quiet  Show less output.
    -nowork  Disable copy, used to try the options before doing it.
    -help  Show this message.
EOF
 exit 1;
 }
 
###########################
### "main" program

GetOptions (
   'help|?|h' => \&show_options,
   'verbose!' => \$verbose, 'v' => \$verbose,
   'quiet!' => \$quiet, 'q' => \$quiet,
   'secret!' => \$nolog, 's' => \$nolog
  )  || exit 1;
print "work_backup utility - $custom_message\n"  unless $quiet;
check_backupdir();
eval {
   if (@ARGV) {
      # process named files on the command line
      die "filename arguments not implemented yet.\n";
      }
   else {
      foreach my $item (@locations) {
         process_location ($item);
         }
      }
   my $time= localtime();
   print LOG "finished $time\n"  unless $nolog;
   print LOG "copied $filecount files\n"  unless $nolog;
   };
if ($@) {
   print LOG "ABNORMAL TERMINATION: $@"  unless $nolog;
   print $@;
   }
print LOG "======== done =========\n"  unless $nolog;
close LOG  unless $nolog;
print "copied $filecount files\n"  unless $quiet;
Replies are listed 'Best First'.
Re: Work Backup
by grinder (Bishop) on May 16, 2001 at 16:36 UTC

    I prefer

    sub printlog { print LOG @_ unless $nolog; } printlog "======== done =========\n";

    to

    print LOG "======== done =========\n" unless $nolog;

    as it makes for less clutter. It is also beneficial in that, with a little more work, $nolog doesn't have to pollute the namespace.

    BEGIN { my $nolog = 0; sub setlog { $nolog = shift; } sub printlog { print LOG @_ unless $nolog; } } setlog 1; printlog "======== done =========\n";

    I like to generalise this functionality to handle debug/production screen output and/or log file output.


    --
    g r i n d e r
      Reducing clutter and generalizing is all well and good, but does it make sense to move such a small amount of logic (unless) to a subroutine? I'm under the impression that sub calls are expensive enough to not be used indiscriminately, but not too expensive to use when necessary.
        I'm under the impression that sub calls are expensive enough to not be used indiscriminately, but not too expensive to use when necessary.

        I wouldn't say that. I would say the the factoring benefits it provides (saving programmer time) far outweigh the cost benefits (having a CPU thrash around a bit). What happens if you need to change that trivial piece of logic? One change versus a multiplicity of changes, and no guarantees that you managed to locate and change all the occurrences of the call.

        Another benefit comes in debugging. Supposing you have an entry printed to your logfile and a short time later the program blows up in a spectacular manner, and also suppose that for some reason you can't identify exactly where in the code this occurs, and/or it is too difficult to set a breakpoint in your own code. This happens a lot when you have a dynamic program that creates new code on the fly as it runs. The easiest solution would be to put a conditional breakpoint in the printlog routine, and then follow the thread of execution back into the main code.

        Can you honestly say that you had a Perl program that was too slow, and the reason for the poor performance was due to excessive subroutine calls? What I do know is that a program with a fine subroutine granularity will lend itself admirably to Devel::DProf which will let you identify the exact points in your code where CPU cycles are being burnt.

        Don't avoid using subs just because you've heard they are "slow." This runs against Donald Knuth's dictum "Premature optimization is the root of all evil". Go read A Tirade Against the Cult of Performance.


        --
        g r i n d e r
      Thanks for the input.

      I find the print LOG / unless to be just fine in this case, but agree that if generalized to several forms of output and modes the seperate sub would be superior. Right now, I have some duplication for things that are printed and logged. However, they are not identical — the log is more formal and needs more context.

      —John