

in reply to Ways to sequence calls from multiple processes

Your program will likely work in 99% of the cases (maybe 99.9999%), but as the opening and locking are two separate steps, it might happen that:
  1. Process A opens lock.txt
  2. Process B opens lock.txt
  3. Process A flocks()
  4. Process B tries to flock().... and fails
So you should really encapsulate your code above in a polling loop that detects locking failures and retries every n seconds on failure.
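
For illustration, here is a minimal sketch of such a polling loop, assuming a non-blocking lock attempt (LOCK_NB); the file name, retry interval and the now_access_the_common_resource() call are just placeholders standing in for the original code:

    use Fcntl qw(:flock);

    my $retry = 5;    # seconds to wait between attempts (illustrative)
    open my $lock_fh, '>', 'lock.txt' or die "can't open lock.txt: $!";
    until (flock $lock_fh, LOCK_EX | LOCK_NB) {
        warn "lock is busy, retrying in $retry seconds\n";
        sleep $retry;
    }
    now_access_the_common_resource();   # the critical section
    close $lock_fh;                     # closing the handle releases the lock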

If you want to be more safe, you should ask yourself: what happens if now_access_the_common_resource() crashes or loops? will the file stay locked forever? how do you detect that the other processes are starving?

My suggestion is: if your program already uses an RDBMS, why not use its locking features? Each and every decent RDBMS already implements consistent locking just to do its own basic work.

You could - for example - have a one-row table with a status field that each process tries to update, like UPDATE mytable SET status=1 WHERE status=0 LIMIT 1. After each update you check the number of affected rows; if it's 1 you got access to the resource (and atomically locked it), if no row was affected you could not obtain access, so you sleep() a random number of seconds and try again. To release the resource you simply UPDATE mytable SET status = 0 WHERE table_id = 123.
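
A rough sketch of that flow in Perl with DBI (the table and column names are the ones above; the $dbh handle and the random back-off are illustrative, and the LIMIT 1 assumes a MySQL-like dialect):

    # try to claim the one free row; do() returns the number of rows affected
    my $rows;
    until ($rows and $rows > 0) {
        $rows = $dbh->do('UPDATE mytable SET status = 1 WHERE status = 0 LIMIT 1');
        sleep(1 + int(rand(5))) unless $rows and $rows > 0;   # could not claim it, back off and retry
    }

    now_access_the_common_resource();

    # release the resource again (row id 123 as in the text above)
    $dbh->do('UPDATE mytable SET status = 0 WHERE table_id = 123');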

The nice part of this approach is that it works not only for a single resource but for limited sets like resource pools too: just add ten rows to the table, and the first ten processes will be allowed to pass, the eleventh will have to wait, and so on.


Re^2: Ways to sequence calls from multiple processes
by bart (Canon) on Nov 28, 2004 at 11:20 UTC
    Your program will likely work in 99% of the cases (maybe 99.9999%), but as the opening and locking are two separate steps, it might happen that:
    1. Process A opens lock.txt
    2. Process B opens lock.txt
    3. Process A flocks()
    4. Process B tries to flock().... and fails
    So you should really encapsulate your code above in a polling loop that detects locking failures and retries every n seconds on failure.
    Eh... what?!?!? There are several things wrong with what you say here.
    1. Opening a file will always work, whether it is locked or not. Actually, whether it may be locked is irrelevant.
    2. Nobody specified use of LOCK_NB; instead he's using LOCK_EX (2), so flock will simply wait, if the file is locked, until it can get a lock. No looping necessary. flock will only return a failure if something is seriously wrong. Your program shouldn't loop then, but die.
    There simply is no race condition.
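
    In other words, the plain blocking pattern already does the right thing; a minimal sketch (same lock.txt as in the parent post):

        use Fcntl qw(:flock);

        open my $lock_fh, '>', 'lock.txt' or die "can't open lock.txt: $!";
        flock $lock_fh, LOCK_EX or die "flock: $!";   # blocks until the lock is granted
        now_access_the_common_resource();
        close $lock_fh;                               # releases the lock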

    If you want to be more safe, you should ask yourself: what happens if now_access_the_common_resource() crashes or loops? will the file stay locked forever? how do you detect that the other processes are starving?

    Hold that thought. Well, one surely nice thing about OS-native file locking is that a flock is automatically released if a process ends without releasing the lock itself, or closing the file. That includes crashes and being killed by other processes. There simply is no problem.

    A problem could indeed be looping forever without releasing the lock... so don't do that.

    My suggestion is: if your program already uses an RDBMS, why not use its locking features? Each and every decent RDBMS already implements consistent locking just to do its own basic work.

    You could - for example - have a one-row table with a status field that each process tries to update, like UPDATE mytable SET status=1 WHERE status=0 LIMIT 1. After each update you check the number of affected rows; if it's 1 you got access to the resource (and atomically locked it), if no row was affected you could not obtain access, so you sleep() a random number of seconds and try again. To release the resource you simply UPDATE mytable SET status = 0 WHERE table_id = 123.

    That approach is simply too atrocious for words. Your solution clearly suffers from the very problem you just warned against: what if the program that holds the lock dies? The status field of that database row will simply never get cleared.

    And you're actually not using "the locking features" of the database, you're using the fact that the update of the database is atomic. I'm not even 100% convinced that it is. You are using the return value of the count of affected rows for a purpose that it was never intended for.

    There are even ways to atomically create a file only if it doesn't exist yet. Look up O_EXCL in perlopentut. That would seem to be a far simpler and more reliable way to do it. And even that seems to me like you're working too hard, because you still have to guard against premature death of your process, and take measures to ensure that the file will indeed get deleted in due time.
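
    For reference, a small sketch of what that O_EXCL approach might look like (file name and retry interval are illustrative); note that the unlink at the end is exactly the cleanup you have to guard:

        use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

        my $fh;
        # O_EXCL makes create-if-not-exists atomic: sysopen fails if lock.txt already exists
        until (sysopen $fh, 'lock.txt', O_WRONLY | O_CREAT | O_EXCL) {
            sleep 1;    # someone else holds the "lock"; try again
        }
        now_access_the_common_resource();
        close $fh;
        unlink 'lock.txt' or warn "could not remove lock.txt: $!";   # if the process dies before this, the file stays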

    Just stick with locking.

      Well, one surely nice thing about OS-native file locking is that a flock is automatically released if a process ends without releasing the lock itself, or closing the file. That includes crashes and being killed by other processes. There simply is no problem.

      I quite like the idea of all subsequent access being blocked if there was a failure. It brings the failure to the admin's attention so he can investigate why it failed in the first place.

      I frankly don't know much about unix locking, and you are right, but I have written concurrent production code in major apps for major players. What I was saying is that you don't need table locking on the database (that has its own uses); you just want mutex access to a pool of resources, and the database does that excellently. And if you add a datetime field that you update along with the status, you can know who locked what and easily inspect it externally. And this is definitely a plus when running concurrent stuff, where bugs are rather hard to detect.
Re^2: Ways to sequence calls from multiple processes
by DrWhy (Chaplain) on Nov 28, 2004 at 14:19 UTC
    You could - for example - have a one-row table with a status field that each process tries to update, like UPDATE mytable SET status=1 WHERE status=0 LIMIT 1. After each update you check the number of affected rows; if it's 1 you got access to the resource (and atomically locked it), if no row was affected you could not obtain access, so you sleep() a random number of seconds and try again. To release the resource you simply UPDATE mytable SET status = 0 WHERE table_id = 123.
    bart's criticisms of this approach are well taken; however, there are ways to take advantage of database locking to solve your problem. In Oracle, for example, you could have the same table described above, but do a SELECT ... FOR UPDATE on mytable or even just a LOCK TABLE mytable. Then you would do your work on your resource. When you are finished, do a ROLLBACK or a COMMIT, which releases the lock. Most other RDBMSs will have similar, but not necessarily identical, language to effect the same behavior. This has the advantage that if your code fails unexpectedly, the database will remove the lock for you. You have to be careful if the resource (or one of the resources) you access is the database itself; in that case doing the rollback will not have the results you want ;)
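
    A rough sketch of that pattern from Perl with DBI (assuming DBD::Oracle and the one-row mytable from above; names are illustrative):

        $dbh->{AutoCommit} = 0;    # the row lock only lives inside a transaction

        # this blocks until the row lock is granted
        my $sth = $dbh->prepare('SELECT status FROM mytable WHERE table_id = 123 FOR UPDATE');
        $sth->execute;

        now_access_the_common_resource();

        # ending the transaction releases the lock; if the process dies first,
        # the database rolls back and releases the lock for us
        $dbh->rollback;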

    All that said, the file locking is still the simpler and more straightforward solution in most (99.9999%) cases.

    --DrWhy

    "If God had meant for us to think for ourselves he would have given us brains. Oh, wait..."

      If you want a lock in Oracle, you're going about it the entirely wrong (though viable, of course) way. Use dbms_lock.request, for example:
      v_lock_sucess:=dbms_lock.request(lockhandle=>v_temp_lockhandle, lockmode=>dbms_lock.x_mode, TIMEOUT=>5, release_on_commit=>TRUE);
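
      A hedged sketch of how that might be driven from Perl via DBI (assuming DBD::Oracle; the lock name, variable names and error handling are illustrative):

          # allocate a named lock handle and request an exclusive lock on it
          my $sth = $dbh->prepare(q{
              DECLARE
                  v_lockhandle VARCHAR2(128);
              BEGIN
                  dbms_lock.allocate_unique(lockname => 'my_resource', lockhandle => v_lockhandle);
                  :result := dbms_lock.request(lockhandle        => v_lockhandle,
                                               lockmode          => dbms_lock.x_mode,
                                               timeout           => 5,
                                               release_on_commit => TRUE);
              END;
          });
          $sth->bind_param_inout(':result', \my $status, 10);
          $sth->execute;
          die "could not obtain lock (dbms_lock status $status)" if $status != 0;    # 0 means success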
Re^2: Ways to sequence calls from multiple processes
by ikegami (Patriarch) on Nov 29, 2004 at 06:08 UTC
    If you want to be more safe, you should ask yourself: what happens if now_access_the_common_resource() crashes or loops? will the file stay locked forever? how do you detect that the other processes are starving?

    You seem to have a poor understanding of how flock (and database locking) works. In addition to flock blocking until the lock succeeds (as bart explained), crashing is not a problem. When the application crashes, the filehandle will get reclaimed by the OS. The lock gets released along with the filehandle.

    You're right that an application can hold the lock indefinitely. For example, the application could enter an infinite loop, or make a blocking call. Switching to a database (especially in the manner you describe) will not magically solve these problems. Welcome to the very complex world of race conditions and fault-tolerant computing.