in reply to Re^2: Read in hostfile, modify, output
in thread Read in hostfile, modify, output

Geez, I see that my poorly explained comment about atomic operations generated some discussion.

To digress a bit, the Wikipedia article on Linearizability contains a good discussion of what "atomic" means. I'll just quote part: "Atomicity is a guarantee of isolation from concurrent processes. Additionally, atomic operations commonly have a succeed-or-fail definition: they either successfully change the state of the system, or have no apparent effect."

In the case of a single rename(), as long as the system software and hardware work as expected, the operation can be considered atomic. Some events, such as a system reset (hardware, software, or human induced) or a power supply transient, can still mess things up.

To replace an important file with a new version, first get the new file onto the system in its entirety. Now a quick two step dance happens. At a minimum, rename the existing file to something else, probably some sort of .bak. Then, as soon as possible, rename the new file to the existing file's original name. Even with everything working perfectly, there is a brief interval between the two renames where "File" does not exist. Any open() attempted during that very short interval will fail.

At the lowest level (the disk-writing hardware), a disk write operation is not "atomic" and can fail this test: either successfully change the state of the system, or have no apparent effect. When writing a sector on the HD, the sequence is: a) old data, b) no data (the write is occurring), c) new data. If something goes wrong during step (b), we are left with unknown garbage. Even the very fast rename function goes through these 3 steps.

Summary: Rename is the fastest and best function to use for replacing files. Get the new file completely ready in advance. Do not mess around with writing the new file line by line - that is too slow and opens up corruption issues.
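
A minimal Perl sketch of the sequence described above, under some assumptions: the new content is written out completely to a staging file in the same directory, the existing file is renamed to a .bak, and the staging file is then renamed into place. The helper name replace_file_two_step and the .new/.bak suffixes are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper (not from any module): replace $target with $content
# by writing the whole new file first and then doing the two renames.
sub replace_file_two_step {
    my ($target, $content) = @_;
    my $new = "$target.new";    # assumed staging name
    my $bak = "$target.bak";    # assumed backup name

    # Get the new file onto the system in its entirety first.
    open(my $out, '>', $new) or die "Cannot write $new: $!";
    print {$out} $content    or die "Write to $new failed: $!";
    close($out)              or die "Cannot close $new: $!";

    # Step 1: move the existing file out of the way.
    rename($target, $bak) or die "Cannot rename $target to $bak: $!";
    # Between these two calls, $target does not exist.
    # Step 2: move the new file into place.
    rename($new, $target) or die "Cannot rename $new to $target: $!";
    return $bak;
}

# Usage in a scratch directory that is cleaned up at exit.
my $scratch = tempdir(CLEANUP => 1);
my $hosts   = File::Spec->catfile($scratch, 'hosts');
open(my $fh, '>', $hosts) or die "Cannot create $hosts: $!";
print {$fh} "old\n";
close($fh) or die "Cannot close $hosts: $!";

my $backup = replace_file_two_step($hosts, "new\n");
```

The slow part (writing the content) happens before either rename, so the window in which the target name is missing is limited to the gap between the two rename() calls.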

Re^4: Read in hostfile, modify, output
by Anonymous Monk on Dec 20, 2016 at 23:14 UTC
    To replace an important file with a new version, first get the new file onto the system in its entirety. Now a quick two step dance happens. At a minimum, rename the existing file to something else

    That's not how you use rename() if you want it to be atomic: you just rename the new file over the old file. There is never a moment when the file doesn't exist. Second, you should read up on journaling filesystems: the chance of corruption is much lower, and in many cases the filesystem can indeed guarantee that there will be no corruption even in the case of sudden power loss.
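
    A minimal Perl sketch of that single-rename approach, assuming a POSIX-style filesystem and a temporary file created in the same directory as the target (rename() does not work across filesystems). The helper name replace_file_atomic is hypothetical, and note that File::Temp creates the file with restrictive permissions, which real code may want to adjust.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(dirname);
    use File::Spec;
    use File::Temp qw(tempdir tempfile);

    # Hypothetical helper: write the complete new content to a temporary
    # file next to $target, then rename it over $target in one step.
    sub replace_file_atomic {
        my ($target, $content) = @_;

        # UNLINK => 0 because the temp file becomes the real file.
        my ($tmp_fh, $tmp_name) = tempfile(DIR => dirname($target), UNLINK => 0);
        print {$tmp_fh} $content or die "Write to $tmp_name failed: $!";
        close($tmp_fh)           or die "Cannot close $tmp_name: $!";

        # On POSIX systems this replaces an existing $target; there is no
        # intermediate state in which the name is missing.
        rename($tmp_name, $target)
            or die "Cannot rename $tmp_name to $target: $!";
        return 1;
    }

    # Usage in a scratch directory that is cleaned up at exit.
    my $workdir  = tempdir(CLEANUP => 1);
    my $hostfile = File::Spec->catfile($workdir, 'hosts');
    open(my $seed, '>', $hostfile) or die "Cannot create $hostfile: $!";
    print {$seed} "old\n";
    close($seed) or die "Cannot close $hostfile: $!";

    replace_file_atomic($hostfile, "new\n");
    ```

    Unlike the two-step version, no backup copy is kept here; if one is wanted, it can be made by copying the old file before the rename rather than by renaming it away.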

      I am not sure which OS and file system you are using.
      Under Windows, this cannot happen. The rename will fail if the target file exists.
      C:\Projects_Perl\testing>echo "this is orginial file" > originalfile.txt
      C:\Projects_Perl\testing>echo "this the new file" > newfile.txt
      C:\Projects_Perl\testing>rename newfile.txt originalfile.txt
      A duplicate file name exists, or the file cannot be found.
      C:\Projects_Perl\testing>rename originalfile.txt originalfile.bak
      C:\Projects_Perl\testing>rename newfile.txt originalfile.txt
      C:\Projects_Perl\testing>type originalfile.txt
      "this the new file"
      C:\Projects_Perl\testing>
      This thread doesn't get into journaling filesystems. The vast majority of folks here are using standard versions of Windows or Unix variants.

      I am aware of the issues you describe, but we are getting into very specialized things with that discussion. I think launching an OS-specific rename command with "override" options is also beyond the scope here.

        Under Windows, this cannot happen. The rename will fail if the target file exists.
        C:\Projects_Perl\testing>rename newfile.txt originalfile.txt
        A duplicate file name exists, or the file cannot be found.
        It can happen. Use the Windows MOVE command instead of RENAME:
        MOVE /Y newfile.txt originalfile.txt
        Or use Perl:
        perl -e "rename('newfile.txt', 'originalfile.txt')"

        Atomic rename is specified by POSIX; there is nothing "specialized" about it. Yes, Windows is more complicated and you have a point in that case, but newer versions of Windows provide some advanced API functions.

        Most modern filesystems, like ext3, ext4, NTFS, and HFS+, support journaling, so "The vast majority of folks here" with their "standard versions of Windows or Unix variants" are already using them. Do you still think "There is no truly atomic operation on the file system"?

        Under Windows, this cannot happen. The rename will fail if the target file exists.

        What? Works for me (it's just not atomic, as far as I can tell).

        echo foo >foo
        echo bar >bar
        perl -e "rename(foo,bar)"