in reply to MD5 Hash

Presumably you're already using Digest, since this is apparently a followup to the thread "file comparison using file open in binary mode". The Digest documentation includes a comparison of the speeds of various digest algorithms, with MD4 being the fastest.

If speed is really a huge deal, you could add an additional comparison stage, like perhaps an MD5 over only the first 64K of each file. Then if those match, do an MD5 over the whole file.
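
A minimal sketch of that two-stage check, assuming Digest::MD5 is available and that reading each file a second time for the full digest is acceptable. The helper names (md5_prefix, md5_whole, same_content) are made up for illustration and are not part of any module:

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: MD5 over at most $limit bytes from the start of a file.
sub md5_prefix {
    my ($path, $limit) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    binmode $fh;                       # raw bytes, no newline translation
    my $buffer = '';
    my $got = read $fh, $buffer, $limit;
    die "Can't read $path: $!\n" unless defined $got;
    close $fh;
    return Digest::MD5->new->add($buffer)->hexdigest;
}

# Hypothetical helper: MD5 over the whole file.
sub md5_whole {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Stage 1: cheap digest of the first 64K. Stage 2: full digest,
# computed only when the prefixes agree.
sub same_content {
    my ($first, $second) = @_;
    return 0 if md5_prefix($first, 2**16) ne md5_prefix($second, 2**16);
    return md5_whole($first) eq md5_whole($second) ? 1 : 0;
}
```

Files that differ somewhere in their first 64K are rejected after one short read each. Only files whose prefixes match pay for the full pass, so the gain depends on how often your files differ early.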

Re^2: MD5 Hash
by Karger78 (Beadle) on Dec 01, 2009 at 20:36 UTC
    That sounds like a good idea. First I will paste the code I have, to make sure it isn't a flaw in the code. What do you think?
    sub md5sum {
        my $file = shift;
        my $digest = "";
        eval {
            open(FILE, $file) or die "Can't find file $file\n";
            my $ctx = Digest::MD4->new;
            $ctx->addfile(*FILE);
            $digest = $ctx->hexdigest;
            close(FILE);
        };
        if ($@) {
            print $@;
            return "";
        }
        return $digest;
    }
      use strict;
      use warnings;
      use Digest::MD4;

      # Digest only the first 64K of the file.
      sub md4sum {
          my $fileName = shift;
          my $digest = "";
          eval {
              open my $file, '<', $fileName or die "Can't open $fileName: $!\n";
              binmode $file;    # raw bytes, no newline translation
              my $buffer;
              read $file, $buffer, 2**16;
              close ($file);
              my $ctx = Digest::MD4->new;
              $ctx->add ($buffer);
              $digest = $ctx->hexdigest;
          };
          if ($@) {
              print $@;
              return "";
          }
          return $digest;
      }

      Update s/2\^16/2**16/. Thanks AnomalousMonk


      True laziness is hard work
      Thanks, that worked perfectly. Thanks for the suggestions, everyone.
Re^2: MD5 Hash
by Karger78 (Beadle) on Dec 01, 2009 at 21:12 UTC
    I like the idea of running MD5 over only the first 64K of each file. Could I build the hash using just the first 64K?