I would think that getting the file size is faster than computing the hash for the file.
You shouldn't be doing either. Both should have been recorded, essentially for free, when the file was written.
If that wasn't done, you could compare the files in a clever order and compute each file's hash while it is being compared. That may save you from having to do more compares later.
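To make the "hash while comparing" idea concrete, here is a rough Python sketch. It is only one way to read the suggestion, not something the thread spells out: the function name `compare_and_hash`, the choice of SHA-256, and the chunk size are all assumptions made for illustration.

```python
import hashlib


def compare_and_hash(path_a, path_b, chunk_size=1 << 20):
    """Compare two files chunk by chunk while hashing both in the same pass.

    Hypothetical helper: returns (identical, digest_a, digest_b).
    """
    hash_a, hash_b = hashlib.sha256(), hashlib.sha256()
    identical = True
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if not chunk_a and not chunk_b:
                break  # both files exhausted
            if chunk_a != chunk_b:
                identical = False  # keep reading so both digests are complete
            hash_a.update(chunk_a)
            hash_b.update(chunk_b)
    return identical, hash_a.hexdigest(), hash_b.hexdigest()
```

The trade-off is that this version keeps reading after the first mismatch so that both digests are complete. That gives up the early exit of a plain compare in exchange for hashes that can be reused against later files.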
So it seems to me that pruning the list of files for which hashes have to be computed by comparing file sizes would be faster, especially for large numbers of files.
As the number of files grows, so does the number of file-size collisions, and the size check alone prunes fewer candidates.
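For reference, the size-first pruning under discussion might look something like the Python sketch below. The names `file_digest` and `find_duplicates` are made up for this example, and SHA-256 is an assumed choice of hash. Files with a unique size are never hashed; files that share a size are hashed and grouped by digest.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Group paths by size, then hash only those that share a size."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    by_digest = defaultdict(list)
    for size, group in by_size.items():
        if len(group) < 2:
            continue  # unique size: cannot have a duplicate, skip hashing
        for path in group:
            by_digest[(size, file_digest(path))].append(path)

    # Keep only the groups that really contain more than one file.
    return [group for group in by_digest.values() if len(group) > 1]
```

How much this saves depends on how many sizes collide: when most files share a size with some other file, nearly everything still gets hashed.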