in reply to Apache and file generation - flock?

It is bad karma to be writing a file that other processes are reading: they might get part of the old file and part of the new, and they probably wouldn't like that...

A better idea would be to write to a temporary file and then, once the write is finished, rename it to the name that the other processes use.

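On Unix-like systems, processes that opened the old file before the rename will still get to read the old file until they close it, while processes that open the file after the rename will get the new data. (Windows may refuse to rename over a file that is still open, so test there.) The temporary file has to live on the same filesystem as the target for the rename to be atomic. Best of all, you have a guarantee that nobody will be reading a partially written file.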
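
A minimal sketch of the write-then-rename idea, assuming a Unix-like system. The helper name atomic_write and the '.tmpXXXXXX' template are made up for illustration; File::Temp and rename are standard Perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Write $data to $target by way of a temporary file, then rename it
# into place. atomic_write is a hypothetical helper name.
sub atomic_write {
    my ($target, $data) = @_;

    # Create the temporary file in the target's own directory, so the
    # rename below never has to cross a filesystem boundary.
    my ($fh, $tmp) = tempfile('.tmpXXXXXX', DIR => dirname($target), UNLINK => 0);

    print {$fh} $data or die "write to $tmp failed: $!";
    close $fh         or die "close of $tmp failed: $!";

    # Readers that already have the old file open keep reading it;
    # anyone who opens $target after this point sees the new contents.
    rename $tmp, $target or die "rename $tmp -> $target failed: $!";
    return;
}
```

Note that File::Temp creates the file with mode 0600, so a chmod before the rename may be needed if the reading processes run as a different user.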

Re: Re: Apache and file generation - flock?
by Roger (Parson) on Mar 01, 2004 at 12:33 UTC
    OK, one process writes to a temporary file and, when it has finished writing, renames the file. At the same time, another process writes to its own copy of the temporary file and, when it has finished, renames the file... The second process will clobber what the first process has written.

    To have many processes successfully writing to the same file, you need some sort of locking mechanism to protect the critical sections of the program. The locking can be implemented using IPC::Semaphore on *nix and Win32::Semaphore on Windows.

    Now, combine the semaphore with the temporary file, and the new algorithm looks like:
    process start
        if read then
            obtain shared read lock on the original text file
            read the data
            release shared lock
        elsif write then
            obtain exclusive write lock on the temporary file
            write the data
            obtain exclusive write lock on the original file
            overwrite the original file with new data
            release lock on original file
            release lock on temporary file
        end if

    This algorithm penalizes the writer, on the assumption that there are more readers than writers and that writing is a lengthy process. The benefit is that locking keeps the data consistent, and readers can still read the original file while the new file is being written.
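
    A rough sketch of the algorithm above. It swaps the semaphores for Perl's built-in flock, which is advisory: it only works if every reader and writer goes through these routines. The names read_data and write_data, the "$file.tmp" naming and the $generate callback are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use IO::Handle;

# Reader: hold a shared lock on the original file while reading it.
sub read_data {
    my ($file) = @_;
    open my $fh, '<', $file or die "open $file: $!";
    flock($fh, LOCK_SH)     or die "flock $file: $!";
    my $data = do { local $/; <$fh> };   # slurp the whole file
    close $fh;                           # closing releases the lock
    return $data;
}

# Writer: the slow part runs under a lock on the temporary file only.
# The original is locked just long enough to copy the finished data in.
# $generate is a hypothetical callback that prints to the handle it is given.
sub write_data {
    my ($file, $generate) = @_;
    my $tmp = "$file.tmp";               # assumed naming convention

    # Open without truncating: another writer may still hold the lock.
    sysopen(my $tfh, $tmp, O_RDWR | O_CREAT) or die "open $tmp: $!";
    flock($tfh, LOCK_EX)                     or die "flock $tmp: $!";
    truncate($tfh, 0)                        or die "truncate $tmp: $!";
    $generate->($tfh);                       # possibly lengthy
    $tfh->flush                              or die "flush $tmp: $!";
    seek($tfh, 0, 0)                         or die "seek $tmp: $!";

    sysopen(my $ofh, $file, O_RDWR | O_CREAT) or die "open $file: $!";
    flock($ofh, LOCK_EX)                      or die "flock $file: $!";
    truncate($ofh, 0)                         or die "truncate $file: $!";
    while (read($tfh, my $buf, 65536)) {
        print {$ofh} $buf or die "write $file: $!";
    }
    close $ofh or die "close $file: $!";     # releases lock on original
    close $tfh or die "close $tmp: $!";      # releases lock on temporary
    return;
}
```

    Readers only block during the final copy, not while $generate is running.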

    Another variant is to omit the temporary file and let the writing process lock the original text file directly, but that penalizes the readers: if writing the data is expensive, all readers must wait for the writer to finish.
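
    A sketch of this variant, again using flock in place of semaphores and assuming that readers take a shared flock on the same file before reading. The name write_in_place and the $generate callback are hypothetical.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);

# Writer that locks the original file for the whole write.
# $generate is a hypothetical callback that prints to the handle it is given.
sub write_in_place {
    my ($file, $generate) = @_;

    # Do not open with '>': that would truncate the file before the lock
    # is held, and a reader could see an empty file.
    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "open $file: $!";
    flock($fh, LOCK_EX)                      or die "flock $file: $!";
    truncate($fh, 0)                         or die "truncate $file: $!";

    $generate->($fh);   # readers wait on their shared lock for all of this

    close $fh or die "close $file: $!";      # flushes and releases the lock
    return;
}
```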

    If you want to minimize the penalties to both the reader and the writer, then you need a more elaborate caching and locking algorithm, which is probably out of scope here.