in reply to MD 5 hash comparison/checker

Do you actually use the MD5 for anything other than checking whether the two files are different? If not, you're just wasting time calculating it.
use strict;
use warnings;

open(my $fh1, '<', $ARGV[0]) or die $!;
binmode($fh1);
open(my $fh2, '<', $ARGV[1]) or die $!;
binmode($fh2);

for (;;) {
    defined(read($fh1, my $buf1='', 4096)) or die $!;
    defined(read($fh2, my $buf2='', 4096)) or die $!;
    if ($buf1 ne $buf2) {
        print("Different\n");
        exit(1);
    }
    last if !length($buf1);
}

print("Same\n");
exit(0);

Re^2: MD 5 hash comparison/checker
by graff (Chancellor) on May 07, 2010 at 02:40 UTC
    Um... okay, if it's just a question of comparing one file to one other file to determine "same" or "different", a byte-for-byte comparison like you suggest certainly makes the most sense. Good call. (Update: Of course, just using the *n*x "cmp" utility will be a lot easier/quicker.)

    But if it were a case of looking for duplicates among a large set of files, using the md5 signatures of the files (in combination with file byte counts) will save a lot of time. (I don't know if the OP represents this sort of "XY Problem" -- talking about comparing two files when the task is actually bigger than that -- but it's worth mentioning in any case.)
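For what it's worth, the duplicate-hunting case could be sketched roughly like this. It is only a minimal sketch, not anything the OP posted: the sub name find_duplicates is made up for illustration, and it assumes the candidate file names arrive as a plain list. Files are grouped by byte count first, so only files that share a size with another file ever get read and digested.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: return a list of array refs, each holding the paths
# of files whose byte count and MD5 digest both match.
sub find_duplicates {
    my @files = @_;

    # Group by size first; a file with a unique size cannot have a duplicate.
    my %by_size;
    for my $path (@files) {
        next unless -f $path;
        my $size = -s $path;
        push @{ $by_size{$size} }, $path;
    }

    my @groups;
    for my $same_size (grep { @$_ > 1 } values %by_size) {
        # Only now pay the cost of reading the files.
        my %by_md5;
        for my $path (@$same_size) {
            open(my $fh, '<', $path) or die "Can't open $path: $!";
            binmode($fh);
            my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
            close($fh);
            push @{ $by_md5{$digest} }, $path;
        }
        push @groups, grep { @$_ > 1 } values %by_md5;
    }
    return @groups;
}

# Print each group of identical files on one line.
print join(' ', @$_), "\n" for find_duplicates(@ARGV);
```

A matching size and MD5 is strong evidence rather than proof, so a byte-for-byte check on each reported group would still be a reasonable last step.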

      (Update: Of course, just using the *n*x "cmp" utility will be a lot easier/quicker.)

      He's on Windows (or else use digest::MD5; wouldn't have worked), and it was faster for me to type up the program than to figure out the DOS command :)

      But if it were a case of looking for duplicates among a large set of files

      Indeed, but there's no evidence of that. That's why I asked and suggested an alternative.

        Actually, Digest::MD5 works on OS X also:
        $ perl -MDigest::Md5 -e ''
        $ perl -MDigest::MD5 -e ''
        $
        The weird mix of "sometimes case-sensitive, sometimes case-insensitive" in OS X is one of my bigger pet peeves about it in theory, but, in practice, it's only bitten me once (when I created two files in the same directory whose names differed only in case and I attempted to delete one, but ended up deleting both).

        Based on other responses by the OP, it's clear that he is using Windows, so your conclusion is correct, even if it's based on a data point which doesn't necessarily support it.

Re^2: MD 5 hash comparison/checker
by daggy (Novice) on May 07, 2010 at 02:55 UTC

    Hi, yeah it's used to compare the hashes.

    I tried your code, but it won't let me specify which files I'd like compared.

    Also, I've noticed that when I run code in Perl, it automatically shuts down at the end of the code, so I can't read the results. How do I stop this?

    It doesn't happen if I run from CMD, but if I just click the .pl file it shuts down at the end.

      but if I just click the .pl file it shuts down at the end.

      Ah. That explains a great deal. If you really want/expect the script to work when it gets launched by clicking on the file's icon in a file browser, consider the following idiom:

      #!/usr/bin/perl
      # (use a unix/linux style shebang line,
      # because someday you will want to use a unix/linux system)
      use strict;

      my $reqd_param_count = 2;  # (e.g. two file names)

      if ( ! @ARGV ) {
          # prompt for interactive input of required parameter(s)
          ...
      }
      elsif ( @ARGV == $reqd_param_count ) {
          # invoked from an interactive shell: required params are in @ARGV
          ...
      }
      else {
          die "Usage: $0 arg1 ...\n";
      }
      But seriously, there ought to be a sensible way to set things up so that a user can easily invoke a perl script with args (that will go into @ARGV). If not, just please switch to some sort of GUI approach (Tk, wx, etc), or else get cozy with using a CLI shell ("bash" is available for windows, and is the best, IMHO).
        How would I go about implementing that code?
      Command line tools are more useful when they accept file names from the command line.
      perl compare.pl file1 file2

      But if you prefer to prompt the user, feel free to adjust at will.
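If you do go the prompting route, one way to fill in the "prompt for interactive input" branch of the idiom above might look like the following. This is a rough sketch under assumptions: the helper names get_file_names and wait_for_enter are invented here, and exactly two file names are expected.

```perl
use strict;
use warnings;

# Hypothetical helper: return two file names, taken from the argument list
# when both are present, otherwise read from the input handle after a prompt.
sub get_file_names {
    my ($args, $in) = @_;
    return @$args if @$args == 2;
    die "Usage: $0 file1 file2\n" if @$args;    # wrong number of arguments

    my @names;
    for my $label ('First file', 'Second file') {
        print "$label: ";
        my $name = <$in>;
        die "No file name given\n" unless defined $name;
        $name =~ s/^\s+|\s+$//g;    # strip the newline and stray spaces
        push @names, $name;
    }
    return @names;
}

# Hypothetical helper: keep the console window open until Enter is pressed,
# which matters when the script was started by clicking its icon.
sub wait_for_enter {
    my ($in) = @_;
    print "Press Enter to exit...";
    my $ignored = <$in>;
    return;
}
```

The script would start with something like `my ($file1, $file2) = get_file_names(\@ARGV, \*STDIN);` and call `wait_for_enter(\*STDIN)` just before exiting. Passing the input handle in, rather than hard-coding STDIN, is just a choice that makes the subs easy to try out on their own.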

Re^2: MD 5 hash comparison/checker
by fullermd (Vicar) on May 07, 2010 at 22:38 UTC
    for (;;) {
        defined(read($fh1, my $buf1='', 4096)) or die $!;
        defined(read($fh2, my $buf2='', 4096)) or die $!;
        if ($buf1 ne $buf2) {
            print("Different\n");
            exit(1);
        }
        last if !length($buf1);
    }

    Why not File::Compare?
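For reference, a minimal sketch of the File::Compare route. The wrapper sub describe_comparison is made up for illustration; compare() itself is the module's exported function, which returns 0 when the files are equal, 1 when they differ, and -1 on error.

```perl
use strict;
use warnings;
use File::Compare;

# Hypothetical wrapper: report "Same" or "Different" for two files,
# dying if either file can't be read.
sub describe_comparison {
    my ($file1, $file2) = @_;
    my $result = compare($file1, $file2);
    die "Can't compare $file1 and $file2: $!\n" if $result < 0;
    return $result == 0 ? 'Same' : 'Different';
}

# Only run the comparison when two file names were given.
if (@ARGV == 2) {
    print describe_comparison(@ARGV), "\n";
}
```

File::Compare ships with Perl, so this should behave the same on Windows as anywhere else.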