in reply to Re: Fast way to compare two picture-files
in thread Fast way to compare two picture-files

Why do people always suggest computing a digest to compare two files? Even generously assuming that calculating the digest is instantaneous, the digest method requires reading both files in their entirety, whereas a byte-for-byte comparison of two different files can usually exit after a single pair of reads.
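To make the early exit concrete, here is a minimal sketch of a byte-for-byte comparison. The function name `files_identical`, the 64 KiB chunk size, and the up-front size check are my own assumptions for illustration, not something from the thread.

```python
import os


def files_identical(path_a, path_b, chunk_size=64 * 1024):
    """Return True if the two files have identical contents.

    Hypothetical helper: the name and chunk_size are illustrative choices.
    """
    # Assumption: files of different sizes can never match, and checking
    # the size costs no reads of the file contents at all.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False

    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                # Early exit: stop at the first pair of reads that differs.
                return False
            if not chunk_a:
                # Both files reached end-of-file without a mismatch.
                return True
```

Only two identical files force a full read of both. Python's standard library offers similar behaviour through `filecmp.cmp(path_a, path_b, shallow=False)`.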

The digest method is great when comparing a file against multiple others over a long time period, because each digest only has to be computed once and can then be reused. It's a poor method when comparing just two files.

If they match, chances are they are the same image.

If you're going to follow up by checking the files byte for byte, it might make more sense to find the digest of the first X bytes of the file instead of the digest of the entire file.
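A rough sketch of that idea follows. The names `prefix_digest` and `same_file_contents`, the 4096-byte prefix, and the choice of SHA-256 are assumptions of mine. The final byte-for-byte check uses the standard library's `filecmp.cmp` with `shallow=False`.

```python
import filecmp
import hashlib


def prefix_digest(path, prefix_len=4096):
    """Return the SHA-256 hex digest of the first prefix_len bytes of a file.

    Hypothetical helper: prefix_len plays the role of "X" in the post.
    """
    with open(path, "rb") as f:
        return hashlib.sha256(f.read(prefix_len)).hexdigest()


def same_file_contents(path_a, path_b, prefix_len=4096):
    """Cheap prefix-digest filter first, full comparison only if it passes."""
    if prefix_digest(path_a, prefix_len) != prefix_digest(path_b, prefix_len):
        # The prefixes differ, so the files differ; nothing more is read.
        return False
    # Matching prefixes prove nothing about the rest of the file,
    # so confirm with a byte-for-byte comparison.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

For a single pair of files this gains nothing over comparing the bytes directly. The prefix digest pays off when it is stored and reused to screen one file against many candidates.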


Re^3: Fast way to compare two picture-files
by abell (Chaplain) on Dec 29, 2008 at 10:17 UTC
    It's out of similar considerations that some time ago I wrote dupseek. Even for multiple files, just using parts of file contents instead of digests works quite well (and rules out even the very small risk of collisions).


    The stupider the astronaut, the easier it is to win the trip to Vega - A. Tucket