perl_help26 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have a remote server with files I need to process every two hours. How can I sftp only the files that were overwritten in the last two hours? Should I use file checksums or mtime, and if so, how? I need the faster way. Thanks

Replies are listed 'Best First'.
Re: copy only modified remote files
by salva (Canon) on Jul 07, 2015 at 15:01 UTC
Re: copy only modified remote files
by jeffenstein (Hermit) on Jul 07, 2015 at 14:14 UTC

    Do you have full shell access to the remote server? If so, maybe consider using rsync to synchronize the changed files. It has options to use the mtime or use checksums.

    If you don't have full shell access, then Net::SFTP is an option. However, the SFTP protocol won't let you do remote checksums, so you are limited to only using the mtime of the remote file.
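A minimal sketch of the mtime-only approach over SFTP. It uses Net::SFTP::Foreign (a CPAN module) rather than Net::SFTP, which is a substitution, not something this reply prescribes. The host, user, and directory arguments are placeholders, and `needs_download` and `sync_changed` are hypothetical helper names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:mode);    # S_ISREG, used to skip directories and links

# True when there is no local copy yet, or the remote mtime is newer
# than the local file's mtime.
sub needs_download {
    my ($remote_mtime, $local_path) = @_;
    return 1 unless -e $local_path;
    my $local_mtime = (stat($local_path))[9];
    return $remote_mtime > $local_mtime ? 1 : 0;
}

# Hypothetical driver: all four arguments are assumptions for illustration.
sub sync_changed {
    my ($host, $user, $remote_dir, $local_dir) = @_;
    require Net::SFTP::Foreign;    # loaded lazily; not a core module
    my $sftp = Net::SFTP::Foreign->new($host, user => $user);
    $sftp->die_on_error("Unable to connect to $host");

    # List regular files only; each entry carries its attributes in {a}.
    my $entries = $sftp->ls($remote_dir,
        wanted => sub { S_ISREG($_[1]->{a}->perm) })
        or die "ls failed: " . $sftp->error;

    for my $entry (@$entries) {
        my $name  = $entry->{filename};
        my $local = "$local_dir/$name";
        next unless needs_download($entry->{a}->mtime, $local);
        # copy_time keeps the remote mtime on the local copy, so the
        # comparison above stays meaningful on the next run.
        $sftp->get("$remote_dir/$name", $local, copy_time => 1)
            or warn "Failed to fetch $name: " . $sftp->error . "\n";
    }
}

sync_changed(@ARGV) if @ARGV == 4;
```

Because the local copy keeps the remote timestamp, the only state needed between runs is the downloaded files themselves.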

Re: copy only modified remote files
by stevieb (Canon) on Jul 07, 2015 at 13:43 UTC

    You haven't provided any code or shown anything you've tried, but there are several ways to do this. I'll leave it up to you to show us the code for logging into the remote system and transferring the files, but here's one way to check the last modification time.

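    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = "this.file";

    if (-f $file) {
        # Element 9 of stat() is the last-modification time in epoch seconds
        my $mtime = (stat($file))[9];
        my $modified_ago = time() - $mtime;
        my $modified_ago_minutes = int($modified_ago / 60);

        # Prints files last modified more than two hours ago;
        # use <= 120 instead to select files changed within the last two hours
        if ($modified_ago_minutes > 120) {
            print "$file\n";
        }
    }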

    -stevieb

      Thank you for your reply. I am just trying to understand which is more reliable/fast: checksum or mtime. The idea is that I want to check whether the remote file was modified before I copy it locally (to save time). Also, is it possible to compute a checksum of the remote file before copying it locally? Thanks
        mtime will be faster in most if not all cases, much faster if your files are large.

        First, yes, you can check on the remote system, and there are a few ways to do that. One is to SSH into the remote machine and run a perl one-liner (or a remote script) on the remote box to report back the mtime (or checksum).
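The SSH approach described above could be sketched roughly as follows. This assumes an `ssh` client on the local machine, key-based login, and `perl` on the remote box. `remote_mtime_command` and `remote_mtime` are hypothetical helper names, and the host and path come from the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the ssh command that asks the remote perl for a file's mtime.
# The path is single-quoted for the remote shell.
sub remote_mtime_command {
    my ($host, $path) = @_;
    (my $quoted = $path) =~ s/'/'\\''/g;
    return ('ssh', $host, "perl -e 'print((stat shift)[9])' '$quoted'");
}

# Run the command and return the mtime in epoch seconds, or undef when
# the remote side printed nothing (for example, the file is missing).
sub remote_mtime {
    my ($host, $path) = @_;
    my @cmd = remote_mtime_command($host, $path);
    open(my $ssh, '-|', @cmd) or die "Cannot run ssh: $!";
    my $out = <$ssh>;
    close($ssh);
    return (defined $out && $out =~ /^(\d+)/) ? $1 : undef;
}

# Usage: script.pl HOST /remote/path
if (@ARGV == 2) {
    my $mtime = remote_mtime(@ARGV);
    print defined $mtime ? "$mtime\n" : "no mtime reported\n";
}
```

Swapping the one-liner for a checksum command on the remote side would follow the same pattern, at the cost of reading the whole file there.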

        Getting a checksum would likely be quicker, but consider that you'll have to store the checksums somewhere then look them back up every program run for comparison, so that's likely to even things out.
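To illustrate the bookkeeping that checksums require, here is a rough sketch that stores MD5 sums in a plain text file and compares against them on the next run. It works on a local path, so for the remote case it would have to run on the remote box. The store format and the helper names (`file_md5`, `load_checksums`, `save_checksums`, `has_changed`) are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::MD5;

# Hex MD5 of a file's contents; this reads the whole file.
sub file_md5 {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $hex;
}

# Load "checksum  path" lines from the store into a hash reference.
sub load_checksums {
    my ($store) = @_;
    my %seen;
    return \%seen unless -e $store;
    open(my $fh, '<', $store) or die "Cannot open $store: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($sum, $path) = split /\s+/, $line, 2;
        $seen{$path} = $sum if defined $path;
    }
    close($fh);
    return \%seen;
}

# Write the hash back out, one "checksum  path" line per file.
sub save_checksums {
    my ($store, $seen) = @_;
    open(my $fh, '>', $store) or die "Cannot write $store: $!";
    print {$fh} "$seen->{$_}  $_\n" for sort keys %$seen;
    close($fh);
}

# True when the file is new or its checksum differs from the stored one;
# the stored value is updated either way.
sub has_changed {
    my ($seen, $path) = @_;
    my $sum = file_md5($path);
    my $changed = !defined $seen->{$path} || $seen->{$path} ne $sum;
    $seen->{$path} = $sum;
    return $changed ? 1 : 0;
}
```

A run would call `load_checksums`, test each file with `has_changed`, then call `save_checksums`. Every check reads the full file, which is the cost an mtime comparison avoids.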

        EDIT: per Laurent_R, mtime will be faster in the long run. I was testing on a minuscule few-byte file.

        -stevieb
