Acapulco has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I've been reading PM for a while but this is my first time asking. Almost always the answer is already there and I just had to search for it... except in this case.

I've been reading about file locking in Perl, as I need to avoid concurrency issues while reading and writing a file.

My general problem is that I have a script that works as a CLI tool, but for certain operations I first need to check the contents of a file and, depending on those contents, do either X or Y. Basically my file contains the state of a part of the system: if that state is, say, A, I can proceed, but if it's B, I can't. The concurrency problem comes from the fact that the CLI can be called multiple times in succession (via scripts, etc.), so those invocations should not have race conditions when checking this file. Thus I thought of file locking.

However, the issue I have is that everywhere I look I find flock being used, and as far as I can tell obtaining a lock this way is not atomic, since I first need to actually open the file.

Please correct me if I'm wrong, but if I first open then flock, isn't there a non-zero probability that this would cause a race condition?

Is there any way to actually do file locking in an atomic way, so that we open AND flock at the same time (or fail) to avoid this?

I've also looked into semaphores as an alternative, specifically IPC::Semaphore (since these are shared between processes, right?). But I am also forking a few times inside the script, so the semaphore would be shared with the children as well, and that would lead to some difficulties: I would need to make sure the semaphore is not released inadvertently in the wrong place (by a child, for example).

Are my assumptions about flock correct, or am I missing something here? If the race condition actually exists, how do people work around it? Or do they just ignore it, since for most use cases it may be a non-issue?

Most CPAN packages are out of the question because I can't really install anything due to policy issues :( and so I was looking for a built-in way to do this.

Thanks a lot for your help! Acapulco

Re: How to do atomic file locking?
by Corion (Patriarch) on Aug 25, 2015 at 17:53 UTC

    The trick is to ignore the open call in your sequence of locking. Two or more processes can (on Unixish operating systems) successfully call open() on the same file, but only one call to flock() will succeed. That should be enough for your program to know whether it is the instance that should proceed or not.

      Thanks! I was not aware of this.

      So now my question is: if two processes get to open the file (I'm working on a Linux 2.6 kernel, so I guess this can happen) but only one gets the lock, which call will fail? The flock one? Can I do "flock or die"?

      Again, thanks a lot. I couldn't find any references to this behavior.

        Nevermind. It seems that indeed I can do flock or die.
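
        Something along these lines seems to be the pattern (state.dat is just a placeholder name here; LOCK_EX and LOCK_NB come from the core Fcntl module, and LOCK_NB makes the losing call fail immediately instead of blocking until the lock is free):

            #!/usr/bin/perl
            use strict;
            use warnings;
            use Fcntl qw(:flock);

            # Both processes may open the file successfully; the lock decides who proceeds.
            # '+<' opens an existing file for reading and writing without truncating it.
            open my $fh, '+<', 'state.dat'
                or die "Cannot open state.dat: $!";

            # Non-blocking exclusive lock: fails at once if another instance holds it.
            flock $fh, LOCK_EX | LOCK_NB
                or die "Another instance holds the lock: $!";

            # ... read the state, decide whether to proceed, update the file ...

            close $fh;    # closing the handle releases the lock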

Re: How to do atomic file locking?
by marinersk (Priest) on Aug 25, 2015 at 20:12 UTC

    Be careful to read all the implementation notes for flock. I used to use it to ensure atomicity all the time, but ran into a few circumstances where it worked fine under stress tests in one environment but would eventually fail in another.

    I never did fully isolate the problem to a coding error, but found references to implementation issues that matched my circumstances.

    I ultimately found other ways to achieve reliable atomicity and have had no need to go back. But oh, once implemented properly in an environment where it worked, flock was graceful.

      Thanks for the advice. I am currently going over the docs.

      Out of curiosity, how did you resolve the atomicity issue? I mean, did you use some other Perl technique, or did you move that part out of Perl altogether and use something else?

      What do you mean by flock being graceful? Do you mean that if flock somehow crashed or misbehaved, it didn't mess things up?

      Thanks again!

        I just realized I never answered your question about what I meant by "graceful".

        You ask Perl to get a lock on the file, and off you go. Cleanup is a matter of closing the file.

        No funky coding, no oddball exception handling, no having to overthink the problem. Ask it to do its job, and it does -- no muss, no fuss, no thirty lines of code. That, to me, is the epitome of grace.

        If memory serves, I coded up a primitive two-stage, file-based, self-help locking mechanism. Something akin to what follows. In this example, for file test.dat:

        1. Create a file (lock request token) based on the original filename
          • Include its creation timestamp
          • Include its modification timestamp
          • Include its server and unique PID
          • Make these sortable by creation, server, pid, modification
        2. Loop until it is the oldest lock request for this file
          • Re-create your lock request token file if it's been removed.
          • Update your lock request token modification timestamp
          • Delete requests older than the lock request timeout
          • sleep a short bit (don't burn the CPU)
        3. Create another file (lock bid token) based on the original filename
          • Include its creation timestamp
          • Include its modification timestamp
          • Include its server and unique PID
          • Make these sortable by creation, server, pid, modification
        4. Loop until it is the oldest lock bid for this file
          • Re-create your lock bid token file if it's been removed.
          • Update your lock bid token modification timestamp
          • Delete bids older than the lock bid timeout
          • sleep a short bit (don't burn the CPU)
        5. Unlocking consists of removing the lock request and lock bid files
        6. Long-running processes need to have a mechanism for updating the modification timestamp on the bid token to avoid it being deleted by others using the same self-help system.

        Then lock requests simply queue up. This is not collision-proof, but in a low-volume environment it was sufficiently collision-resistant for my needs (the math was favorable).
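
        For what it's worth, a rough sketch of just the first stage (the lock request token) might look something like the following; the subroutine names and the 30-second timeout are invented for illustration, and the real scheme layers the bid-token stage on top of this in the same way:

            #!/usr/bin/perl
            use strict;
            use warnings;
            use File::Basename qw(basename dirname);
            use Sys::Hostname  qw(hostname);
            use Time::HiRes    qw(sleep time);

            my $TIMEOUT = 30;    # seconds before a stale request is discarded

            # Hypothetical helper: block until our request token is the oldest one.
            sub acquire_token_lock {
                my ($file) = @_;
                my $dir  = dirname($file);
                my $base = basename($file);

                # Sortable token name: <zero-padded epoch>.<host>.<pid>
                my $token = sprintf "%s/%s.lockreq.%010d.%s.%d",
                    $dir, $base, time(), hostname(), $$;

                while (1) {
                    # Re-create our request token if it has been removed.
                    unless (-e $token) {
                        open my $fh, '>', $token or die "Cannot create $token: $!";
                        close $fh;
                    }
                    # Touch it so other processes can see we are still alive.
                    utime undef, undef, $token;

                    # All outstanding requests for this file, oldest first
                    # (assumes no whitespace in the path).
                    my @requests = sort glob "$dir/$base.lockreq.*";

                    # Discard requests whose owners appear to have gone away.
                    for my $req (@requests) {
                        next if $req eq $token;
                        my $mtime = (stat $req)[9];
                        next unless defined $mtime;            # vanished meanwhile
                        unlink $req if time() - $mtime > $TIMEOUT;
                    }

                    @requests = sort glob "$dir/$base.lockreq.*";
                    return $token if @requests and $requests[0] eq $token;

                    sleep 0.1;    # don't burn the CPU
                }
            }

            # Unlocking consists of removing our token.
            sub release_token_lock { unlink $_[0] }

            my $token = acquire_token_lock('test.dat');
            # ... work on test.dat (a long-running job would also need to keep
            #     touching its token, per step 6 above) ...
            release_token_lock($token);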

Re: How to do atomic file locking?
by mlawren (Sexton) on Aug 26, 2015 at 19:05 UTC

    An alternative to file locking is socket locking. You might find some inspiration in the source of my Lock::Socket module. Basically, a process cannot bind an already bound socket.
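
    A rough sketch of the idea using only the core IO::Socket::INET, for illustration (the port number is an arbitrary choice, not something taken from Lock::Socket itself):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use IO::Socket::INET;

        # Whoever manages to bind the well-known loopback port holds the "lock".
        my $lock = IO::Socket::INET->new(
            LocalAddr => '127.0.0.1',
            LocalPort => 54321,    # arbitrary fixed port agreed on by all instances
            Proto     => 'tcp',
            Listen    => 1,
        ) or die "Another instance appears to hold the lock: $!\n";

        # ... protected work; the lock is held for as long as $lock stays open ...

        close $lock;    # released here, and automatically when the process exits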

Re: How to do atomic file locking?
by anonymized user 468275 (Curate) on Aug 26, 2015 at 22:16 UTC
    On the other hand, the use of message queues is on the increase, I believe, especially where there is an unlimited number of front-end clients. The advantage is that while the front ends (including the immediate server processing) remain asynchronous, the message queue synchronises the requests as a first step towards controlling the threads optimally, whereas a file lock forces all processes to wait, often unnecessarily. Typically, for mobile apps, a device server forwards requests via a message queue to a transaction manager and waits for a response before displaying the result back to the device. Although both are daemons, the device server performs asynchronous multiprocessing that needs no concurrency control, while the transaction manager, relying on the synchronisation provided by the message queue (that is all it is for), can choose where to control concurrency using up to, say, 16 threads, e.g. letting the DBMS do its own locking while preventing concurrent updates to a storable. This improves the response time compared with making all clients wait for a single lock.
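
    Purely as an illustration of the queueing idea, the core SysV IPC modules can stand in for a real message-queue product (the key derivation and payload below are invented, and this only works on SysV-capable systems):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use IPC::SysV qw(ftok IPC_CREAT S_IRUSR S_IWUSR);
        use IPC::Msg;

        # Front ends and the transaction manager agree on a key derived from a path.
        my $key   = ftok($0, 1);
        my $queue = IPC::Msg->new($key, IPC_CREAT | S_IRUSR | S_IWUSR)
            or die "Cannot create or attach message queue: $!";

        # A front-end client just enqueues its request and carries on.
        $queue->snd(1, "update record 42")
            or die "snd failed: $!";

        # The transaction manager drains the queue; requests arrive already serialised.
        my $buf;
        defined $queue->rcv($buf, 1024)
            or die "rcv failed: $!";
        print "processing: $buf\n";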

    One world, one people

Re: How to do atomic file locking?
by Anonymous Monk on Aug 25, 2015 at 17:48 UTC

    "Please correct me if I'm wrong, but if I first open then flock, isn't there a non-zero probability that this would cause a race condition?"

    You're wrong.

Re: How to do atomic file locking?
by locked_user sundialsvc4 (Abbot) on Aug 26, 2015 at 04:45 UTC

    It is also important to consider the operating system and file system. Some scenarios provide only “advisory locking.” Here is a good web page that discusses the issue for Linux ... but Windows is different. The flock perldoc page is mandatory reading.

    You also need to be sure that all disk writes are actually complete before you unlock. This StackOverflow article seems like a good one to peruse.

    Remember that, if you have a more-complicated situation (and if you can be certain that no “unaware” programs might be accessing the same files at the same time), you can use a “lock file” ... a dummy file whose only purpose is to be the target of lock/unlock requests.
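
    As a concrete illustration of the lock-file idea (with made-up file names, and remembering the point above about finishing and flushing your writes before giving up the lock):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Fcntl qw(:flock);

        # state.lock exists only to be locked; the real data lives in state.dat.
        open my $lock_fh, '>', 'state.lock'
            or die "Cannot open lock file: $!";
        flock $lock_fh, LOCK_EX
            or die "Cannot obtain exclusive lock: $!";

        # ... read or rewrite state.dat here; write and close it (which flushes)
        #     before the lock is released ...

        close $lock_fh;    # closing the lock handle releases the lock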

      Thanks for the advice. Yes, I'm aware this solution is only "advisory-based", but since I'm creating the lock file expressly for this purpose, and I'm sure it won't be residing on a network file system or run into other such nuances, I believe it should work fine.

      Regarding the file flushing, I was not aware of this, but it definitely makes sense; otherwise we might end up with inconsistencies. Fortunately, in my particular use case I usually open the shared file, write something, and close it immediately, and as far as I can tell close should flush.

      Again, thanks a lot for the advice. I like to be extra careful since I know concurrency is very easy to get wrong.