http://qs1969.pair.com?node_id=803808


in reply to Re^6: Assistance with file compare
in thread Assistance with file compare

I would think that getting the file size is faster than computing the hash for the file. So it seems to me that pruning the list of files for which hashes have to be computed by comparing file sizes would be faster, especially for large numbers of files.
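
For illustration, here is a minimal sketch of that size-first pruning. It is not code from this thread: `find_duplicates` is a hypothetical helper, and MD5 via the core Digest::MD5 module is only an assumed choice of hash. It buckets paths by size using `stat` (no file contents are read), then hashes only the files that share a size with at least one other file.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper (not from the thread): returns a list of array refs,
# each holding paths whose size and MD5 digest both match.
sub find_duplicates {
    my @paths = @_;

    # Pass 1: bucket by size. This needs only a stat call per file.
    my %by_size;
    for my $path (@paths) {
        my $size = (stat $path)[7];
        next unless defined $size;    # stat failed: skip this entry
        push @{ $by_size{$size} }, $path;
    }

    # Pass 2: hash only the buckets that hold more than one file.
    my %by_digest;
    for my $size (keys %by_size) {
        my $bucket = $by_size{$size};
        next if @$bucket < 2;         # unique size, so no duplicate possible
        for my $path (@$bucket) {
            open my $fh, '<', $path or next;   # unreadable: skip
            binmode $fh;
            my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
            close $fh;
            push @{ $by_digest{"$size:$digest"} }, $path;
        }
    }

    # Keep only the groups with two or more members.
    return grep { @$_ > 1 } values %by_digest;
}
```

Calling `find_duplicates(@candidate_paths)` would return the groups of likely duplicates. A matching digest is strong evidence rather than proof, so a final byte-by-byte check (for example with the core File::Compare module) could confirm each group.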

I am curious to know why your second method is better for many files. Could you enlighten me please?

Re^8: Assistance with file compare
by ikegami (Patriarch) on Oct 28, 2009 at 21:38 UTC

    I would think that getting the file size is faster than computing the hash for the file.

    You shouldn't be doing either. It should have been done for free when the file was written.

    If it wasn't, you could compare files in a clever order and calculate their hashes as they are being compared. This may save you from having to do more compares.
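
One possible reading of "calculate the hash while comparing" is sketched below. It is an assumption about what is meant, not code from this post: `compare_and_digest` is a hypothetical helper, MD5 is an assumed hash, and the "clever order" of comparisons is left out. The point it shows is that a full comparison already reads every byte, so the digest can be a by-product of the same read.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: compare two files block by block.
# Returns (1, $hex_digest) if they are identical, (0, undef) otherwise.
# The digest is only available when the comparison ran to the end.
sub compare_and_digest {
    my ($path_a, $path_b, $block_size) = @_;
    $block_size ||= 64 * 1024;

    open my $fh_a, '<:raw', $path_a or die "Cannot open $path_a: $!";
    open my $fh_b, '<:raw', $path_b or die "Cannot open $path_b: $!";

    my $md5 = Digest::MD5->new;
    while (1) {
        my $n_a = read $fh_a, my $buf_a, $block_size;
        my $n_b = read $fh_b, my $buf_b, $block_size;
        die "Read error: $!" unless defined $n_a && defined $n_b;

        return (0, undef) if $buf_a ne $buf_b;   # mismatch: stop early
        last if $n_a == 0;                       # both files at EOF
        $md5->add($buf_a);                       # same bytes feed the digest
    }
    return (1, $md5->hexdigest);
}
```

When the two files match, the returned digest can be stored and reused to rule out later candidates without reading these files again.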

    So it seems to me that pruning the list of files for which hashes have to be computed by comparing file sizes would be faster, especially for large numbers of files.

    As the number of files grows, the number of collisions in file size grows.