I think I figured it out. Sometimes two processes open the file with the same filehandle, and they therefore share the lock because it's tied to the filehandle.

No. open never returns "the same file handle" that some other process is using, of course (both processes could certainly get file descriptor 3, but that doesn't make the file handles related).

The biggest mistake in your code is using '>' in open my $FH, '>', $file or next;.

What is happening is that several processes notice the existence of random.tmp. One of them, "X", gets there first and overwrites¹ the file then locks it but then gets suspended while the other processes take turns trying to get some work done. A short time later, one of the other processes that noticed the existence of this file, "Y", finally gets a chance to run and overwrites the file before getting suspended. At this point X and Y each have a file handle open to the same file and X has a lock on it.

¹ I'm using "overwrite" here as short-hand for what opening with ">" does. But this doesn't create a new version of the existing file; it truncates the existing file. If the file doesn't exist, however, it creates a new file.

A short while later X runs a bit more and deletes the file and unlocks it. Then a third process, "Z", tries to open the same file that was just deleted but since ">" is used a new random.tmp is created and Z now has a handle open to it. So Y has a handle open to the original random.tmp that has just been deleted by X while Z has a handle open to the new random.tmp (and nobody has a lock). Since Y and Z have handles open to different files, they both manage to lock their handles about the same time and report this success.
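To make that sequence concrete, here is a rough single-process sketch of what Y, X, and Z do, run one step after another instead of in three racing processes. The temporary directory and the variable names are mine, not from the original code, and it assumes a Unix-like filesystem where an unlinked file lives on while a handle is open to it.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Scratch directory for the demo (an assumption, not part of the original code).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/random.tmp";

# "Y" opens the original random.tmp (created here for the demo).
open my $y_fh, '>', $file or die "Y open: $!";

# "X" deletes the file by name; Y's handle stays open to the old file.
unlink $file or die "unlink: $!";

# "Z" opens the same name; since it is gone, '>' creates a new file.
open my $z_fh, '>', $file or die "Z open: $!";

# The two handles refer to different files (different inode numbers).
my $y_ino = (stat $y_fh)[1];
my $z_ino = (stat $z_fh)[1];

# So both "exclusive" locks succeed; they don't conflict at all.
my $y_locked = flock($y_fh, LOCK_EX | LOCK_NB) ? 1 : 0;
my $z_locked = flock($z_fh, LOCK_EX | LOCK_NB) ? 1 : 0;

print "Y: inode $y_ino, locked=$y_locked\n";
print "Z: inode $z_ino, locked=$z_locked\n";
```

Both handles report a successful LOCK_EX because each lock is on a different file, which matches the doubled "success" reports described above.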

Of course, during this same time, a whole bunch of other processes are overwriting either of those two versions of random.tmp and silently not reporting that they couldn't get a lock. In fact, it is probably one process that creates the second random.tmp then a fourth process manages to open it and get the lock, but that doesn't change any of your output.

The other mistake is thinking that waiting .1 seconds is nearly long enough to ensure that 500 (!) processes all have time to finish dealing with the file that they noticed. If you replaced ">" with "+<", then you'd likely get fewer reports of success locking that file and those reports would all be at least .1 seconds apart. Given that systems can get bogged down, it is usually best to have a 2-phase technique so that you can't be burned by a process taking much longer than you expected to get from step 1 to step 2. A third mistake is not waiting between deleting the file and unlocking it.
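As a small illustration of why the open mode matters, the sketch below shows the two properties of Perl's read/write mode '+<' that help here: it refuses to create a missing file, and it leaves an existing file's contents alone. The scratch directory and the "payload" content are made up for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory for the demo (an assumption, not part of the original code).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/random.tmp";

# '+<' on a missing file fails instead of quietly creating a new one.
my $opened_missing = open(my $missing_fh, '+<', $file) ? 1 : 0;

# Create the file with some content, as whatever produces it would.
open my $out, '>', $file or die "create: $!";
print {$out} "payload\n";
close $out or die "close: $!";

# '+<' on an existing file does not truncate it.
open my $fh, '+<', $file or die "open: $!";
my $size_after_open = -s $fh;
close $fh or die "close: $!";

print "opened missing file: $opened_missing\n";
print "size after '+<' open: $size_after_open\n";
```

So a process that arrives after the file has been deleted simply fails to open it, rather than creating a fresh random.tmp for others to lock.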

Since you unlink the file before you unlock it, you can prevent some of your race conditions by having the process that obtains a lock on a file check that the file that it has open hasn't been deleted since it got the lock. So, after you get the lock, instead of uselessly trying to get the lock a second time, stat the file by name and stat the file handle that you have open and verify that they refer to the same file (same inode number returned). This presumes that you don't have a malicious extra process doing "ln random.tmp save.tmp; sleep 1; ln save.tmp random.tmp", of course. But a new random.tmp being created (not a new link to the old random.tmp) would not cause any problems.
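A minimal sketch of that check might look like the following. The helper names same_file and open_and_lock are hypothetical (mine, not from the original code), and it assumes the file is opened with '+<' so that a missing file is never recreated. It compares device and inode numbers from the two stat calls.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: true if the open handle and the name still
# refer to the same file (same device and inode number).
sub same_file {
    my ($fh, $file) = @_;
    my @by_fh   = stat $fh;
    my @by_name = stat $file;
    return 0 unless @by_fh && @by_name;    # name is gone, or stat failed
    return ($by_fh[0] == $by_name[0] && $by_fh[1] == $by_name[1]) ? 1 : 0;
}

# Hypothetical helper: returns a locked handle, or nothing if the file
# is missing, locked by someone else, or was deleted/replaced before
# the lock was obtained.
sub open_and_lock {
    my ($file) = @_;
    open my $fh, '+<', $file or return;
    flock($fh, LOCK_EX | LOCK_NB) or return;
    return unless same_file($fh, $file);   # stale handle: give up
    return $fh;                            # caller now owns the lock
}
```

A caller would then do something like my $fh = open_and_lock($file) or next; and only unlink the file while still holding $fh. When the helper returns nothing, the lexical handle goes out of scope and is closed, which also releases any lock it held.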

- tye

In reply to Re^2: flock LOCK_EX not locking exclusively (races) by tye
in thread flock LOCK_EX not locking exclusively by Anonymous Monk
