in reply to Re^4: Synchronizing multiple processes retrieving files out of shared directory
in thread Synchronizing multiple processes retrieving files out of shared directory

I'm working on something very similar, and I can tell you for a fact that NFS complicates things considerably.

These resources were incredibly helpful: the Linux NFS FAQ and the NFS Coding how-to, both mentioned below.

Essentially, the first time you ask NFS for a file (via stat or open), the client checks with the server. For the next 3 seconds (by default), any stat call is answered from the attribute cache instead of the actual file. This means that even though you check for existence (-e $file), NFS can and will lie to you.
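To make that concrete, here is a minimal sketch that opens the file instead of trusting -e. It assumes close-to-open consistency is in effect (i.e. the share is not mounted with nocto), so that open makes the client revalidate with the server. The name really_exists is made up for this example, and treating ESTALE as "gone" is my own assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: ask via open() rather than via the attribute cache.
# Returns 1 if the file could be opened, 0 if it is gone, and dies on
# any other error (permissions, I/O, ...).
sub really_exists {
    my ($path) = @_;
    if (open my $fh, '<', $path) {
        close $fh;
        return 1;
    }
    # %! maps errno names to true/false for the current value of $!
    return 0 if $!{ENOENT} || $!{ESTALE};
    die "Cannot open $path: $!";
}
```

Use it wherever you would have written `-e $file`, e.g. `next unless really_exists($file);`. It costs a round trip to the server per call, which is the point.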

With a directory, the same thing is true: NFS has a filehandle cache. The first time you readdir on the directory, the client caches that response for a while afterwards (on Linux the directory attribute timeout, acdirmin, defaults to 30 seconds). So files can be deleted by other NFS clients and (again) NFS will lie to you.

From the Linux NFS FAQ (http://nfs.sourceforge.net/):

After a file is deleted on the server, clients don't find out until they try to access the file with a file handle they had cached from a previous LOOKUP. Using rsync or mv to replace a file while it is in use on another client is a common scenario that results in an ESTALE error.
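One way to cope with ESTALE in Perl is to reopen the file by name and try again. The following is only a sketch under that assumption: slurp_with_retry and the default of 3 attempts are invented for illustration, and it does nothing about another client replacing the file between attempts.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file, reopening it by name if the
# cached file handle turns out to be stale. The retry count is arbitrary.
sub slurp_with_retry {
    my ($path, $tries) = @_;
    $tries //= 3;
    for my $attempt (1 .. $tries) {
        my $fh;
        unless (open $fh, '<', $path) {
            next if $!{ESTALE};      # stale handle: look the name up again
            die "Cannot open $path: $!";
        }
        local $/;                    # slurp mode for this iteration only
        $! = 0;                      # so a leftover errno is not mistaken for ESTALE
        my $data  = <$fh>;
        my $stale = $!{ESTALE};      # record before close() can change $!
        close $fh;
        return $data unless $stale;
    }
    die "Still getting ESTALE for $path after $tries attempts";
}
```

Any error other than ESTALE on open (including a plain "No such file or directory") is fatal here, so wrap the call in eval if a vanished file is an expected outcome.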

The NFS Coding how-to has suggestions on how to work around the NFS cache as a programmer. It also has a C utility that you can download and compile on two different servers that share the NFS storage so you can determine which workarounds work and which don't.

In general, open + close on the file and touching the directory *should* clear the caches on most Linux/Unix operating systems. VMS and Windows are a bit different though, so definitely utilize the resources above before they vanish into the dark recesses of the interwebs.
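A rough Perl version of that "touch the directory, then open + close each file" idea might look like the sketch below. fresh_listing is a made-up name, and whether this actually clears the caches on your clients is exactly what the how-to's test utility is for, so treat it as a starting point rather than a guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper: nudge the client into revalidating its caches,
# then return only the entries that can still be opened.
sub fresh_listing {
    my ($dir) = @_;

    # "Touch" the directory: undef, undef sets atime and mtime to now.
    utime(undef, undef, $dir)
        or warn "Could not touch $dir: $!";

    opendir(my $dh, $dir) or die "Cannot opendir $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    # open + close on each entry; drop the ones that fail to open.
    my @present;
    for my $name (@names) {
        if (open my $fh, '<', "$dir/$name") {
            close $fh;
            push @present, $name;
        }
    }
    my @sorted = sort @present;
    return @sorted;
}
```

Call it as `my @files = fresh_listing('/mnt/shared/incoming');` (the path is just an example). It is much slower than a bare readdir, since every entry costs at least one trip to the server.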


Re^6: Synchronizing multiple processes retrieving files out of shared directory
by BrowserUk (Patriarch) on Jun 04, 2015 at 15:58 UTC

    Doesn't surprise me. I've seen similar problems with the MS implementation of the CIFS/SMB protocol.

    In theory atomic rename works; and on a good day (probably a Thursday in June), when no one else is looking, it does. But there are no guarantees.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
    In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked