in reply to Replace duplicate files with hardlinks

Nice script :) The error-checking looks quite robust. However, may I suggest a change to the order of the system calls:

Currently, you move one of the dupes to a temporary location, then link the original file to the old location. I'd suggest first link()ing the original to a temporary location, then rename()ing that temporary name to the location of the dupe (rename atomically overwrites its destination if it exists).

The advantage is that there is no time window in which either file is inaccessible under its original filename. If the script crashes (say, between the link and the rename), the worst that can happen is that it leaves behind an unneeded temporary file. This might sound somewhat academic, but I like to avoid as many race conditions as possible in the scripts that I write :)
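
To make the suggested order of calls concrete, here is a minimal sketch. It is written in Python rather than Perl and is not the script under discussion; the function name replace_with_hardlink and the temporary-name pattern are assumptions made for the example. It also assumes both files live on the same filesystem, since hard links cannot cross filesystems.

```python
import os


def replace_with_hardlink(original, dupe):
    """Replace `dupe` with a hard link to `original`.

    Hypothetical helper: returns True if `dupe` was replaced, False if
    it was already a link to the same file.
    """
    # rename() does nothing when both names already refer to the same
    # file, which would leave the temporary name behind, so stop early.
    if os.path.samefile(original, dupe):
        return False

    # Assumed naming scheme: a hidden name next to the dupe, so that the
    # final rename stays within one directory (and one filesystem).
    dupe_dir = os.path.dirname(os.path.abspath(dupe))
    tmp = os.path.join(
        dupe_dir, ".%s.%d.tmp" % (os.path.basename(dupe), os.getpid())
    )

    # Step 1: give the original a second name. link() refuses to
    # overwrite, so a stale temporary file raises an error here.
    os.link(original, tmp)
    try:
        # Step 2: rename the temporary name over the dupe. The dupe's
        # name exists at every moment; only its target changes.
        os.replace(tmp, dupe)
    except OSError:
        # On failure, nothing but the temporary name needs cleaning up.
        os.unlink(tmp)
        raise
    return True
```

A crash between the two steps leaves the original and the dupe untouched, with only the extra temporary name to delete.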

Re^2: Replace duplicate files with hardlinks
by bruno (Friar) on Aug 10, 2008 at 21:45 UTC
    Thanks for the comment! I agree with you; your approach makes the script more secure, and it doesn't require any extra complexity. I'll change it as you suggest.