syswrite might be atomic, but that doesn't buy you anything, because open is still a separate step. Another process may have appended data to the file between your opening it and appending to it. Your file pointer will not advance to reflect the new end of the file, so you will overwrite that process's data.
The only safe way to append to a file is to open it, obtain an exclusive lock, seek to its end, and then write. That seeking step is necessary exactly because of the above issue: between opening the file and getting a lock on it, someone else may have appended data to it.
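The open, lock, seek, write sequence described above could be sketched roughly as follows. This is only a minimal illustration: append_locked() is a made-up helper name, and it assumes every writer of the file cooperates by taking the same advisory lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_END);

# append_locked() is a hypothetical helper name, not something from this thread.
sub append_locked {
    my ($path, $data) = @_;

    open(my $fh, '>>', $path)  or die "Cannot open $path: $!";

    # Block until we hold the exclusive (advisory) lock.
    flock($fh, LOCK_EX)        or die "Cannot lock $path: $!";

    # Someone may have appended between our open and our lock,
    # so move to the current end of the file before writing.
    seek($fh, 0, SEEK_END)     or die "Cannot seek in $path: $!";

    print {$fh} $data          or die "Cannot write to $path: $!";

    # close flushes the buffered data and then releases the lock.
    close($fh)                 or die "Cannot close $path: $!";
    return 1;
}
```

A call would look like append_locked('app.log', "one line\n"), where the file name is again just an example.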
What I said still stands: there is no way around locking.
Makeshifts last the longest.
| [reply] |
open is still a separate step. Another process may have appended data to the file between your opening it and appending to it. Your file pointer will not advance to reflect the new end of the file, so you will overwrite that process's data.
From the Solaris write(2) manpage:
If the O_APPEND flag of the file status flags is set, the
file offset will be set to the end of the file prior to each
write and no intervening file modification operation will
occur between changing the file offset and the write operation.
This is, for example, how multiple Apache processes all manage to append to the same log file without explicit locking.
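In Perl, relying on that behaviour might look something like the sketch below. append_line() is a hypothetical name, and the sketch assumes a local filesystem and a short record written with a single syswrite; it is an illustration of the O_APPEND idea, not a guarantee for every platform.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_APPEND O_CREAT);

# append_line() is a hypothetical helper name used for illustration only.
sub append_line {
    my ($path, $line) = @_;

    # O_APPEND asks the kernel to position at end of file before each write.
    sysopen(my $fh, $path, O_WRONLY | O_APPEND | O_CREAT, 0644)
        or die "Cannot open $path: $!";

    # One unbuffered write per record, so the record is not split by Perl's buffering.
    my $written = syswrite($fh, $line);
    die "Write to $path failed: $!" unless defined $written;
    die "Short write to $path"      unless $written == length($line);

    close($fh) or die "Cannot close $path: $!";
    return $written;
}
```

Each process would simply call append_line($logfile, "$$\tsome message\n") with no explicit lock.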
Dave.
| [reply] |
Interesting. man 2 open on FreeBSD says
Opening a file with O_APPEND set causes each write on the file to be appended to the end.
That is somewhat ambiguous, though it can be interpreted to mean the same as the Solaris manpage says about O_APPEND.
On Linux, man 2 open says
- O_APPEND
The file is opened in append mode. Before each write, the file pointer is positioned at the end of the file, as if with lseek. O_APPEND may lead to corrupted files on NFS file systems if more than one process appends data to a file at once. This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be done without a race condition.
So this seems to be a portable assumption. (The cautionary note about files on NFS-mounted filesystems found in the Linux manpage is probably applicable across the board as well.) I wonder since when this has been the case; it hasn't always been.
| [reply] |
Open for append + syswrite seems to work perfectly on Win32. Running the following code with an initial arg of 10 starts 10 copies of the code, all writing to the same file. When they complete, a quick sort shows that every line, from each process, is in the file and uncorrupted.
It seems that specifying append mode not only does an initial seek to the end, but also ensures that a seek-to-end is done implicitly each time a write is done. (Or maybe my testcase is crap?)
#! perl -slw
use strict;

## Decide a sync point if first instance
## or use the supplied sync time otherwise
my $time = $ARGV[ 1 ] || time() + 2;

## Ensure "recursion" stops.
$ARGV[ 0 ]--;

## "Recursively" start asynchronous copies of ourself.
system qq[start cmd /c $0 $ARGV[ 0 ] $time ] if $ARGV[ 0 ] > 0;

## Wait until sync time to give other copies a chance to get going.
select undef, undef, undef, 0.1 until time() > $time;

for( 1 .. 1000 ) {
    ## Reopen in append mode for every line; each line is tagged with our PID.
    open FH, '>> :raw', 'data/append.tst' or die $!;
    syswrite FH, "$$\t$_\n";
    close FH;
    ## Slow things a little so that they don't all come out together.
    select undef, undef, undef, 0.001;
}
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
| [reply] |
Indeed, you are correct. That's a shame, it would have been useful to me.
Removing the delay and upping the line count to 10,000 eventually showed some corruption creeping in.
With regard to the binmode, I was under the impression that ':raw' was equivalent to it, especially in conjunction with syswrite?
| [reply] |
I seem to remember the string had to be less than a certain length for this to be true. (1024, but that might be system-dependent.)
There are so many "if"s that this doesn't seem too reliable to me. I would at least give the user the option of using flock if I were to try to take advantage of this atomic property of syswrite.
Besides, flock will allow you to do longer operations on the log, such as shrinking it in a nightly job.
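Giving the user that option could be as simple as the sketch below. The names write_log() and the "lock" option are invented for this example; without the option it relies on append mode plus a single syswrite, and with it the write is additionally wrapped in an exclusive flock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# write_log() and its "lock" option are hypothetical names for illustration.
sub write_log {
    my ($path, $line, %opt) = @_;

    # '>>' opens the file in append mode.
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";

    # Optional advisory lock for callers who do not trust append mode alone.
    if ($opt{lock}) {
        flock($fh, LOCK_EX) or die "Cannot lock $path: $!";
    }

    defined(syswrite($fh, $line)) or die "Cannot write to $path: $!";

    # Closing the handle also releases the lock, if one was taken.
    close($fh) or die "Cannot close $path: $!";
    return 1;
}
```

For instance, write_log($file, "message\n") uses the lock-free path, while write_log($file, "message\n", lock => 1) takes the lock first. A nightly job that shrinks the log would take the same exclusive lock while it works.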
| [reply] |