Actually, if the data for the current run produces the same SHA1 hash string (or the same MD5 signature, if you prefer) as some other file that already exists, wouldn't that mean that the existing file already contains exactly the same data as the current run? Barring a hash collision, which is vanishingly unlikely for data that wasn't deliberately crafted to collide, it would. In that case, you have a built-in way to decide when a new file doesn't need to be written to disk, because the data in question are already there.
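
Here is a rough sketch of that idea in Python, assuming each output file is named after the SHA1 digest of its contents. The function name `write_if_new`, the `out_dir` parameter, and the `.dat` extension are made up for illustration; adapt them to however your runs actually name their files.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def write_if_new(data: bytes, out_dir: Path) -> Tuple[Path, bool]:
    """Store data under a file name derived from its SHA1 digest.

    Returns the target path, plus True if a file was written or
    False if a file with that digest name was already on disk.
    (Hypothetical helper; the naming scheme is an assumption.)
    """
    digest = hashlib.sha1(data).hexdigest()
    target = out_dir / f"{digest}.dat"

    if target.exists():
        # A file with this digest name already exists, so we treat its
        # contents as identical to the current data and skip the write.
        return target, False

    out_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target, True
```

Calling it twice with the same bytes should write the file only the first time; the second call returns the same path with `False`. If you'd rather use MD5, swapping `hashlib.sha1` for `hashlib.md5` is the only change needed.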