It was out of similar considerations that I wrote dupseek some time ago. Even with many files, comparing parts of the file contents directly instead of digests works quite well (and rules out even the very small risk of collisions).
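To make the idea concrete, here is a minimal Python sketch of comparing parts of file contents rather than digests. It is not dupseek's actual code: the function name `find_duplicates`, the `chunk_size` parameter and the size-first grouping are my own assumptions for illustration. Files are grouped by size, then each group is split again and again by the chunk read at the same offset, so files that differ early are dropped after very little reading.

```python
import os
from collections import defaultdict


def find_duplicates(paths, chunk_size=4096):
    """Return lists of paths whose contents are byte-for-byte identical.

    Illustrative sketch only (hypothetical helper, not dupseek's code).
    """
    # Files of different sizes can never be identical, so group by size first.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    duplicates = []
    for size, group in by_size.items():
        if len(group) < 2:
            continue
        pending = [group]  # candidate groups still indistinguishable so far
        offset = 0
        while pending and offset < size:
            next_pending = []
            for candidates in pending:
                # Split the group by the raw bytes found at this offset.
                by_chunk = defaultdict(list)
                for path in candidates:
                    with open(path, "rb") as fh:
                        fh.seek(offset)
                        by_chunk[fh.read(chunk_size)].append(path)
                # A file alone in its subgroup has no duplicate: drop it.
                next_pending.extend(
                    sub for sub in by_chunk.values() if len(sub) > 1
                )
            pending = next_pending
            offset += chunk_size
        # Whatever survived every chunk comparison is identical throughout.
        duplicates.extend(sorted(sub) for sub in pending)
    return duplicates
```

Calling `find_duplicates(list_of_paths)` returns one sorted list per set of identical files. Since the grouping key is the bytes themselves, two files only stay together if they really match, and reopening each file per chunk keeps the sketch short at the cost of some speed.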
The stupider the astronaut, the easier it is to win the trip to Vega - A. Tucket