Flock() over NFS

moseley has asked for the wisdom of the Perl Monks concerning the following question:


Re: Flock() over NFS
by robartes (Priest) on Mar 14, 2003 at 07:18 UTC

Perl's flock calls the system flock(2) when it is available.
NFS locking has to be done through fcntl(2), not through flock(2).
If the system does not have flock(2), Perl's flock tries its best to emulate it using whatever method is available, including fcntl(2).
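
To see which of these primitives your own perl was built against, something like the following may help. It is only a sketch: locking_primitives is a made-up helper name, and the %Config entries it reads describe what was detected when perl was compiled, not how any particular NFS client or server behaves.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: report which locking primitives this perl was built with.
# Each value is 'yes' when the corresponding %Config entry is 'define'.
sub locking_primitives {
    my %found;
    for my $key (qw(d_flock d_fcntl_can_lock d_lockf)) {
        my $value = $Config{$key};
        $found{$key} = (defined($value) && $value eq 'define') ? 'yes' : 'no';
    }
    return %found;
}

my %primitives = locking_primitives();
print "$_: $primitives{$_}\n" for sort keys %primitives;
```

If d_flock comes back as 'no', Perl's flock is being emulated on top of one of the other two.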

You also know by now that locking over NFS is not very robust. It is essentially a rickety superstructure on top of almost nonexistent foundations: NFS was designed to be stateless, which mixes with concepts such as locking about as elegantly as a camel dancing the tango with an elephant. Beware of dragons.

CU
Robartes-


Re: Flock() over NFS
by PodMaster (Abbot) on Mar 14, 2003 at 11:07 UTC

File::FlockDir

File::NFSLock



Re: Flock() over NFS
by bluto (Curate) on Mar 14, 2003 at 01:44 UTC

The problem is that flock may indeed work for the common case so your test won't necessarily report a problem. Keep in mind that even Sun, the infamous inventor of the NFS protocol, has had problems in the past where they swore they fixed the locking, but didn't.

There are other ways of locking over NFS, but these tend to be crufty and rely on knowing that certain operations are atomic in the NFS server.


Re: Flock() over NFS
by perrin (Chancellor) on Mar 14, 2003 at 04:05 UTC

This has come up a few times before on Perlmonks and there was some valuable advice. I suggest you hit the search engine and read some words of wisdom about this from tilly and others.


Re: Flock() over NFS
by hardburn (Abbot) on Mar 14, 2003 at 00:14 UTC

I'm not too familiar with NFS, but have you checked from another machine that the file is locked? It could be that your local machine considers it locked, but the filesystem as a whole doesn't.

Also, I believe there are a few OSes that force locking for everyone (by making the file readable only by a random uid that the process runs under, IIRC), but Linux isn't one of them.



Re: Flock() over NFS
by Anonymous Monk on Mar 14, 2003 at 12:37 UTC

Oh, and always, always, always check the return value of flock. If you ever try out a new platform, you won't remember to run a complex manual test. But the return value from flock will indicate that it cannot get a lock right away.
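
A minimal sketch of what that check might look like. try_lock is a made-up helper, lock.file is just the name used elsewhere in this thread, and the use of LOCK_NB (so that a held lock reports failure instead of blocking) is an assumption on my part:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: try to take an exclusive lock without blocking.
# Returns the locked filehandle on success; on failure returns nothing
# and leaves the reason in $!.
sub try_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or return;
    unless (flock($fh, LOCK_EX | LOCK_NB)) {
        my $err = $!;      # save the flock error before close can clobber it
        close $fh;
        $! = $err;
        return;
    }
    return $fh;
}

my $lock = try_lock('lock.file')
    or die "$$ could not lock lock.file: $!";
print "$$ holds the lock\n";
close $lock;               # closing the handle releases the lock
```

The point is only that every flock call sits behind an "or die" (or equivalent), so an unsupported or failing lock is reported rather than silently ignored.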


Re: Re: Flock() over NFS
by moseley (Acolyte) on Mar 14, 2003 at 20:13 UTC

I did write a pair of scripts - IIRC they wrote log files of the number incremented in a file and then I checked if there were any duplicates reported between the two logs or any numbers out of sequence. All seemed as expected.

So either I was not testing it in a way to break it, or it is no longer an issue.

Perrin, I did read tilly's articles (here's one), which state that flock does not work on Linux. Since I can't make it not work, I was looking for a script to show me that it fails.

Let's see if I still have the script... ah, here was my test script:


#!/usr/bin/perl -w
use strict;
use Fcntl qw(:DEFAULT :flock);
use Time::HiRes 'usleep';

# Each process logs the numbers it wrote to its own file, named after its PID.
open LOG, ">$$.log" or die "can't open $$.log: $!";

while ( 1 ) {
   # lock.file must already exist; it is only used as the thing to lock.
   open LOCK, "lock.file" or die "lock file $!";
   die "$$ failed to get lock: $!" unless flock(LOCK,LOCK_EX);

   # perlfaq example: read the counter, increment it, write it back
   sysopen(FH, "numfile", O_RDWR|O_CREAT)
        or die "can't open numfile: $!";
   my $num = <FH> || 0;
   chomp $num;
   seek(FH, 0, 0)  or die "can't rewind numfile: $!";
   truncate(FH, 0) or die "can't truncate numfile: $!";
   $num++;
   (print FH $num, "\n") or die "can't write numfile: $!";
   close FH              or die "can't close numfile: $!";

   print LOG "$num\t$$\n";

   close LOCK;   # closing the handle releases the lock
   usleep( 100 );
   last if $num >= 100000;
}

I ran about four or five processes at the same time and then merged and sorted and made sure there were no duplicates or missing numbers in the logs.
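
For anyone wanting to repeat the check, the merge-and-compare step could be sketched roughly like this. check_logs is a made-up name, and the sketch assumes the "number, tab, pid" log lines written by the script above and a counter that started at 1:

```perl
use strict;
use warnings;

# Hypothetical checker for "<number>\t<pid>" log lines.
# Returns two array refs: numbers logged more than once, and numbers
# missing from the range 1 .. highest number seen.
sub check_logs {
    my (@files) = @_;
    my %seen;
    for my $file (@files) {
        open(my $fh, '<', $file) or die "can't open $file: $!";
        while (my $line = <$fh>) {
            my ($num) = $line =~ /^(\d+)\t/ or next;   # skip malformed lines
            $seen{$num}++;
        }
        close $fh;
    }
    my @nums = sort { $a <=> $b } keys %seen;
    my $max  = @nums ? $nums[-1] : 0;
    my @duplicates = grep { $seen{$_} > 1 } @nums;
    my @missing    = grep { !$seen{$_} } 1 .. $max;
    return (\@duplicates, \@missing);
}

# Pass the per-process log files on the command line.
my ($dups, $missing) = check_logs(@ARGV);
print "duplicates: @$dups\n";
print "missing: @$missing\n";
```

A duplicate means two processes read the same counter value, which is exactly the failure a broken lock would produce.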

Thanks,


Re: Flock() over NFS
by jamesw (Initiate) on Mar 14, 2003 at 22:37 UTC

My code to replicate the problem was much more aggressive than yours, spawning hundreds of child processes. I'd suggest modifying your code to take a command line argument and fork that many child processes. Run it with lots of children on both your NFS client and server hosts and see if you can get any failures that way.
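
Roughly what I have in mind, as a sketch only. spawn_children is a made-up name, the default of 4 children is arbitrary, and the anonymous sub stands in for your locking loop:

```perl
use strict;
use warnings;

# Hypothetical driver: fork $count children, run $work in each one,
# wait for them all, and return how many exited with a failure status.
sub spawn_children {
    my ($count, $work) = @_;
    my @pids;
    for (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: run the work and turn a die into a non-zero exit status.
            my $ok = eval { $work->(); 1 };
            warn $@ unless $ok;
            exit($ok ? 0 : 1);
        }
        push @pids, $pid;
    }
    my $failed = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $failed++ if $? != 0;
    }
    return $failed;
}

# Number of children comes from the command line; 4 is an arbitrary default.
my $children = @ARGV ? $ARGV[0] : 4;
my $failures = spawn_children($children, sub {
    # Placeholder: the lock / increment / log loop would go here.
    print "$$ running\n";
});
print "$failures of $children children failed\n";
```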

It could be that your UNIX client and server have a working, compatible NFS locking implementation between them, but this test will not tell you whether your code will work with other clients or servers. So why not use a guaranteed-to-work method like atomically renaming files via link(), preferably via a standard CPAN module?
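
To illustrate the idea only (this is not the code of any CPAN module, and acquire_link_lock and release_link_lock are made-up names): create a uniquely named file, hard-link it to the lock name, and treat the lock as yours if the unique file ends up with a link count of 2. The sketch does not handle stale locks left behind by a crashed process, which a real module would have to.

```perl
use strict;
use warnings;
use Sys::Hostname;

# Hypothetical helper: try once to take the lock named $lockfile.
# Returns 1 if we got it, 0 if someone else already holds it.
sub acquire_link_lock {
    my ($lockfile) = @_;
    # A name no other process should pick: lock name + host + pid.
    my $unique = join '.', $lockfile, hostname(), $$;
    open(my $fh, '>', $unique) or die "can't create $unique: $!";
    close $fh;
    # link() fails if $lockfile already exists, so only one caller can win.
    my $linked = link($unique, $lockfile);
    # The reply to link() can get lost over NFS, so also check the link
    # count: 2 means our link() took effect even if it reported failure.
    my $nlink = (stat($unique))[3] || 0;
    unlink $unique;
    return ($linked || $nlink == 2) ? 1 : 0;
}

# Hypothetical helper: give the lock back by removing the lock name.
sub release_link_lock {
    my ($lockfile) = @_;
    return unlink $lockfile;
}

if (acquire_link_lock('numfile.lock')) {
    print "$$ got the lock\n";
    # ... critical section goes here ...
    release_link_lock('numfile.lock');
} else {
    print "$$ found the lock busy\n";
}
```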


Re: Re: Re: Flock() over NFS
by Anonymous Monk on Mar 15, 2003 at 00:23 UTC

I am sure that it was not working for tilly when he wrote that. But Linux does not stand still.
