A suggestion to avoid possible collisions: if CPU time is not a concern, using several different algorithms to create multiple fingerprints increases the improbability of a collision to astronomically high figures. Even using the same algorithm on the original string and a variant created by some transliteration rules to obtain multiple fingerprints will exponentially decrease the probability of collisions.