Re: Best practices for modifying a file in place: q's about opening files, file locking, and using the rename function

Q1: In "The truly paranoid programmer would lock the file", which file are the authors referring to?
The $old file. That is the data that would get clobbered (assuming you are not using the same name for the temp $new file). BTW, I would name the files $orig and $tmp; that seems to make more sense.

Q2: Regarding the reason for being "truly paranoid" -- is this because we don't want another running instance of this script to be writing to $new while we are,
Nope. You're stuck on the $new temp file when the $old original file is what you should be concerned about. You should be using File::Temp to get a uniquely named $new temp file.
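
A minimal sketch of getting such a temp file, assuming you want it in the same directory as the original so a later rename() stays on one filesystem. The helper name new_temp_for and the 'tmpXXXXXX' template are made up for illustration:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: create a uniquely named temp file next to $orig.
# Returns an open write handle and the temp file's path.
sub new_temp_for {
    my ($orig) = @_;

    # File::Temp replaces the trailing X's with random characters and
    # creates the file exclusively, so two callers never share a name.
    my ($fh, $tmp) = tempfile(
        'tmpXXXXXX',
        DIR    => dirname($orig),   # same directory as the original
        UNLINK => 0,                # we intend to rename it, not delete it
    );
    return ($fh, $tmp);
}
```

Because every caller gets its own name, nothing another instance does can touch your $new file; only the shared, fixed-name file is at risk.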

I'm not sure I completely understand the hazards of "clobbering." So...
It's when this happens:

UserA                    UserB                    Orig File
Open $orig                                        Original Content
                                                          |
Reads $orig              Opens $orig                      |
                                                          |
Modify $orig in Memory   Reads $orig                      |
                                                          |
Write $orig to FS        Modify $orig in Memory     UserA Content
                                                          |
                         Write $orig to FS          UserB Content

There are two problems: UserA's changes last only a split second, but the generally more important problem is that UserB never saw the changes UserA made.
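
The same sequence can be replayed in a single script to see the lost update. This is only a simulation under stated assumptions: the two "users" are just two variables in one process, and slurp/spew are hypothetical helpers written for this example:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helpers for this demo: read or overwrite a whole file.
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot read $path: $!";
    local $/;                     # slurp mode: read everything at once
    my $content = <$fh>;
    close $fh;
    return $content;
}

sub spew {
    my ($path, $content) = @_;
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $content;
    close $fh or die "Cannot close $path: $!";
}

my $dir  = tempdir(CLEANUP => 1);
my $orig = "$dir/orig.txt";
spew($orig, "original\n");

my $a_copy = slurp($orig);                 # UserA reads $orig
my $b_copy = slurp($orig);                 # UserB reads before UserA writes

spew($orig, $a_copy . "UserA's line\n");   # UserA writes
spew($orig, $b_copy . "UserB's line\n");   # UserB writes a stale copy

print slurp($orig);
```

The script prints "original" followed by "UserB's line"; UserA's line is gone because UserB wrote back a copy that was read before UserA's write.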

Q3: Is the problem the fact that $new might exist already because another instance of this script running at the same time had created $new a split-second ago in connection with its own update of $old, and that our process will destroy the contents of that $new due to the way ">" works,
Nope (at least if you use File::Temp). You only have to be concerned about the file that has the unchanging name. That is where 'clobbering' occurs.

Q4: In a multi-user environment, does a careful programmer need to use "sysopen/flock LOCK_EX/truncate" every time a script needs to write a file? And now a final wrinkle on the addition of a file lock for $new in the recipe.
Depends.

If the data is really important, then yes, you should.
If it's not critical and doesn't change very often, locking matters much less.
If you are reasonably sure that only one instance of one program will be updating the file, locking is generally not required.

The flip side is that if your data is important, changed by more than one source, and changed often, then you should generally use a full database that supports locking. This is why file locking is rarely a huge problem in practice.
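
For reference, the "sysopen/flock LOCK_EX/truncate" idiom from the question can be sketched like this. It's a minimal example, assuming the file holds a single counter; the helper name bump_counter is invented for illustration:

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);

# Hypothetical helper: increment a one-line counter file in place
# while holding an exclusive lock on the file itself.
sub bump_counter {
    my ($path) = @_;

    # Open read/write, creating the file if needed, WITHOUT truncating it.
    # (A plain '>' open would wipe the file before we hold the lock.)
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)                      or die "Cannot lock $path: $!";

    # Read the current value; a brand-new file counts as 0.
    my $line  = <$fh>;
    my $count = 0;
    if (defined $line) {
        chomp $line;
        $count = $line;
    }
    $count++;

    # Only now, with the lock held, rewind and discard the old content.
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} "$count\n" or die "Cannot write $path: $!";

    # close() flushes the buffer and releases the lock.
    close $fh or die "Cannot close $path: $!";
    return $count;
}
```

Note that flock is advisory: it only protects you from other programs that also call flock on the same file.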

Q5: Wouldn't we want to keep $new open (and hence the LOCK_EX in place) until after the "rename( $new, $old )"?
You're still stuck on $new, but I'll rework your question towards what I think you want to ask: 'When should I be releasing a lock?'

The best strategy IMO is to create a '.lock' file and flock that. Like this:

Once your program decides to modify the file 'foo.txt', check for a flocked 'foo.lock' file. If it's clear, create 'foo.lock' and lock it.
read 'foo.txt'
modify
write it to a unique temp file via File::Temp
rename temp file to 'foo.txt'
delete 'foo.lock'

This prevents corruption both from clobbering and from your program dying in mid-write.
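
A rough sketch of those steps, not a definitive implementation. The helper name update_file and its arguments are made up for illustration, and it departs from the list above in two places, both assumptions on my part: it simply blocks in flock until the lock is free instead of checking first, and it leaves 'foo.lock' on disk instead of deleting it, because unlinking a lock file that another process has already opened can let two processes each believe they hold the lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: run $modify (a code ref taking and returning the
# file's contents) on $file while holding an exclusive lock on $lockfile.
sub update_file {
    my ($file, $lockfile, $modify) = @_;

    # Open the lock file (creating it if needed) and wait for the lock.
    open my $lock, '>>', $lockfile or die "Cannot open $lockfile: $!";
    flock($lock, LOCK_EX)          or die "Cannot lock $lockfile: $!";

    # Read the current contents.
    open my $in, '<', $file or die "Cannot read $file: $!";
    my $content = do { local $/; <$in> };
    close $in;

    # Modify in memory.
    my $new_content = $modify->($content);

    # Write to a unique temp file in the same directory as $file.
    my ($out, $tmp) = tempfile('tmpXXXXXX', DIR => dirname($file), UNLINK => 0);
    print {$out} $new_content or die "Cannot write $tmp: $!";
    close $out                or die "Cannot close $tmp: $!";

    # Replace the original in one step; readers see the old or the new
    # file, never a half-written one.
    rename($tmp, $file) or die "Cannot rename $tmp to $file: $!";

    # Closing the handle releases the lock.
    close $lock;
    return;
}
```

Usage would look something like update_file('foo.txt', 'foo.lock', sub { my $text = shift; $text =~ s/old/new/g; return $text }). If the program dies before the rename, 'foo.txt' is untouched and the lock goes away with the process.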

grep

One dead unjugged rabbit fish later
