
Ah. That's a tough one. Digests can help, but since two very different sets of data can share a digest, a digest alone won't be unique. You could prepend the digest with a fixed slice of the file (say, 10 bytes taken from some known offset), but even that wouldn't be a guarantee.

It comes down to finding something that's unique enough about the data that when it's combined with a digest, you have a "pretty much guaranteed" unique key. MIME type, maybe? Combined with a longer digest (say, SHA-256), that would be pretty good.

Two different digests (say, MD5().SHA256()) would be a likely candidate, too -- the chance that data "A" will have a digest collision with data "B" in two different digest systems is fantastically small.
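For what it's worth, here is a rough sketch of the two-digest idea. It's in Python rather than Perl only because the standard hashlib module makes it self-contained. The function name combined_key is made up for illustration, and joining the two hex strings end to end is just one assumed way of combining them.

```python
import hashlib


def combined_key(data: bytes) -> str:
    """Build a key by concatenating the MD5 and SHA-256 hex digests.

    Hypothetical helper: the name and the plain concatenation are
    assumptions for illustration, not an established convention.
    """
    md5_hex = hashlib.md5(data).hexdigest()        # 32 hex characters
    sha256_hex = hashlib.sha256(data).hexdigest()  # 64 hex characters
    return md5_hex + sha256_hex


if __name__ == "__main__":
    # Two different inputs collide on this key only if they collide
    # under both digests at once.
    print(combined_key(b"data A"))
    print(combined_key(b"data B"))
```

The key is always 96 hex characters, so it works as a fixed-width column or hash key. It is still not a guarantee of uniqueness, only a very unlikely collision.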

<-radiant.matrix->
A collection of thoughts and links from the minds of geeks
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet

Re^4: Simple Digests?
by pileofrogs (Priest) on Mar 31, 2006 at 02:20 UTC

    I really don't need that much protection from collisions. It's not like I need to worry about someone intentionally trying to create a collision. Instead of 1-in-a-zillion odds of a collision, I could probably get by with 1 in a thousand.
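To put rough numbers on "1 in a thousand", here is a sketch (Python again, standard library only) that uses the usual birthday-bound approximation to estimate how many digest bits are needed for a given number of items. The function names, and the choice to truncate a SHA-256 hex digest, are assumptions for illustration. The estimate also assumes the digest behaves like a uniform random function and that nobody is deliberately trying to create collisions.

```python
import hashlib
import math


def collision_probability(n_items: int, bits: int) -> float:
    """Approximate chance that any two of n_items share a digest of
    the given bit length: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** bits
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


def bits_needed(n_items: int, max_probability: float) -> int:
    """Smallest digest width (in bits) that keeps the estimated
    collision probability at or below max_probability."""
    bits = 1
    while collision_probability(n_items, bits) > max_probability:
        bits += 1
    return bits


def short_key(data: bytes, bits: int) -> str:
    """Hypothetical helper: truncate a SHA-256 hex digest to at
    least the requested number of bits (4 bits per hex character)."""
    hex_chars = math.ceil(bits / 4)
    return hashlib.sha256(data).hexdigest()[:hex_chars]


if __name__ == "__main__":
    n = 1000  # assumed number of items; adjust to the real data set
    b = bits_needed(n, 1 / 1000)
    print(b, collision_probability(n, b), short_key(b"example", b))
```

The thing to notice is that the needed width grows with the square of the item count, so the answer depends heavily on how many items there are, not just on the odds you will tolerate.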