Re: file comparison

You could compute and compare hash digests (e.g. with Digest::MD5) of files that have identical size. The probability that two different files accidentally have the same digest is very low (MD5 is not collision-resistant against deliberately crafted files, though). If you opt for a byte-by-byte comparison instead, you should read the data in rather big chunks. See sysread to get an idea.
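
A rough sketch of the chunked comparison, in case it helps. The names files_equal_by_content and read_chunk and the 1 MiB default chunk size are my own assumptions, not anything standard; only sysread, binmode and -s are core Perl.

```perl
use strict;
use warnings;

# sysread may return fewer bytes than requested, so keep reading
# until the chunk is full or we hit end of file.
sub read_chunk {
    my ($fh, $want, $name) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        my $got = sysread($fh, $buf, $want - length($buf), length($buf));
        die "cannot read $name - $!" unless defined $got;
        last if $got == 0;    # end of file
    }
    return $buf;
}

# Hypothetical helper: returns 1 if both files have identical content.
sub files_equal_by_content {
    my ($file1, $file2, $chunk_size) = @_;
    $chunk_size ||= 1024 * 1024;    # arbitrary default, tune as needed

    # Files of different size can never be equal, so skip the reads.
    my $size1 = -s $file1;
    my $size2 = -s $file2;
    return 0 unless defined $size1 && defined $size2 && $size1 == $size2;

    open(my $fh1, '<', $file1) or die "cannot open $file1 - $!";
    open(my $fh2, '<', $file2) or die "cannot open $file2 - $!";
    binmode($fh1);
    binmode($fh2);

    while (1) {
        my $buf1 = read_chunk($fh1, $chunk_size, $file1);
        my $buf2 = read_chunk($fh2, $chunk_size, $file2);
        return 0 if $buf1 ne $buf2;    # first differing chunk ends the comparison
        return 1 if $buf1 eq '';       # both files exhausted without a difference
    }
}
```

Unlike the digest approach, this stops at the first differing chunk, so it can be cheaper when files differ early.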

Update: As a precaution, you should also make sure the files are really read from disk or network and not from the OS's file cache. I don't know how to do that on Windows, though. *nix has sync, although that only flushes pending writes to disk and does not by itself drop cached reads.

Update2: In response to this node below: you could try something along these lines:

use strict;
use Digest::MD5;

sub get_md5 {
  my $file = shift;

  open (my $fh, '<', $file) or die "cannot open $file - $!";
  binmode($fh);

  my $md5 = Digest::MD5->new;
  $md5->addfile($fh);

  close($fh) or die "cannot close $file - $!";

  return $md5->hexdigest; # TODO: think about caching results...
}


sub files_equal_by_md5 {
  my ($file1, $file2) = @_;

  # files differ in size? 
  return 0 if (-s $file1 != -s $file2);

  my $digest1 = get_md5($file1);
  my $digest2 = get_md5($file2);

  return $digest1 eq $digest2 ? 1 : 0;
}

die "usage: $0 file1 file2\n compares file1 and file2\n" unless @ARGV == 2;
print files_equal_by_md5($ARGV[0],$ARGV[1])
  ? "files are equal"
  : "different files",
  "\n";
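
Regarding the TODO about caching: one possible sketch is to key the cache on path, size and modification time, so a changed file gets re-digested. The name get_md5_cached and the %md5_cache hash are hypothetical, and the cache only lives as long as the process does.

```perl
use strict;
use warnings;
use Digest::MD5;

my %md5_cache;    # hypothetical cache: "path|size|mtime" => hex digest

sub get_md5_cached {
    my $file = shift;

    # stat fields 7 and 9 are size and mtime; both go into the cache key
    my @st = stat($file) or die "cannot stat $file - $!";
    my $key = join '|', $file, $st[7], $st[9];

    return $md5_cache{$key} if exists $md5_cache{$key};

    open(my $fh, '<', $file) or die "cannot open $file - $!";
    binmode($fh);
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh) or die "cannot close $file - $!";

    return $md5_cache{$key} = $digest;
}
```

This only pays off when the same file is compared against several others in one run.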

HTH
