If you are on a Linux/UNIX system, you could look into using hard links (which only work if everything is on the same partition) or symbolic links. Then you can have multiple references to the same file, rather than writing a lot of special code to track down files by their CRC. I could be misunderstanding the question, though.
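
In case it helps, here is a rough sketch of what the two kinds of link look like from Python, assuming a POSIX system. The function name `link_demo` and the file names are made up for illustration; only `os.link`, `os.symlink`, and the checks at the bottom are standard library calls.

```python
import os
import tempfile


def link_demo(base_dir):
    """Create one file plus a hard link and a symbolic link to it.

    All paths here are hypothetical names chosen for the example.
    """
    original = os.path.join(base_dir, "original.txt")
    hard = os.path.join(base_dir, "hard_link.txt")
    soft = os.path.join(base_dir, "soft_link.txt")

    with open(original, "w") as f:
        f.write("shared contents\n")

    # Hard link: a second directory entry for the same inode.
    # Both names must live on the same filesystem.
    os.link(original, hard)

    # Symbolic link: a small special file that stores the target path.
    # It may cross filesystems, but dangles if the target is removed.
    os.symlink(original, soft)

    return original, hard, soft


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        original, hard, soft = link_demo(d)
        print("same inode:", os.path.samefile(original, hard))
        print("link count:", os.stat(original).st_nlink)
        print("is symlink:", os.path.islink(soft))
        print("resolves to original:",
              os.path.realpath(soft) == os.path.realpath(original))
```

With hard links, the data stays on disk until the last name is deleted, so there is no single "real" copy to keep track of. With symbolic links, one path is the real file and the others just point at it.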