PerlMonks  

Re: fsyncing directories

by ig (Vicar)
on Apr 27, 2010 at 17:07 UTC ( [id://837146]=note )


in reply to fsyncing directories

I wonder what your objective is. For a thought-provoking discussion (not all happy thoughts, unfortunately), have a look at https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/317781; you might skim through to entry 45 by Theodore Ts'o. In short, not even sync, much less fsync, is adequate for ensuring your data is written to disk and reliably retrievable, though either might reduce your risk.

Replies are listed 'Best First'.
Re^2: fsyncing directories
by betterworld (Curate) on Apr 27, 2010 at 18:42 UTC

    I had that particular bug tracker item open in my browser earlier today :)

    However, it is quite long, and I did not read everything in that discussion. The topic is the difference between ext3 and ext4, particularly the data=ordered mode and delayed allocation. Where does it say that not even sync, much less fsync, is adequate for ensuring your data is written to disk and reliably retrievable? The entry that you reference (45) states quite the opposite imho. On ext4 (especially when replacing files on kernels before Linux 2.6.30 or thereabouts) you must fsync() your files at the appropriate times to avoid a high risk of data loss. On ext3 it is (a bit) helpful too, though it might slow things down.
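For illustration, the "fsync before replacing a file" idea can be sketched roughly as follows. This is a Python sketch rather than Perl, and the name `replace_file` is made up for the example; it assumes a POSIX-like system and is not taken from the bug report. It writes the new contents to a temporary file in the same directory, fsyncs that file, and only then renames it over the old one.

```python
import os
import tempfile


def replace_file(path, data):
    """Replace the file at path with data (bytes): write temp, fsync, rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays
    # within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # move Python's buffer into the kernel
            os.fsync(tmp.fileno())   # ask the kernel to write it to the device
        os.replace(tmp_path, path)   # rename over the old file
    except BaseException:
        # Leave no stray temp file behind if anything went wrong.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Whether the rename itself has reached the disk is a separate question, which is where fsyncing the directory comes in.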

    I wonder what your objective is.

    I was just reading that passage in the man page (which I quoted in my original post) and was wondering how to do that in Perl.
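As a rough sketch of what fsyncing a directory looks like at the system-call level (again in Python rather than Perl, with made-up helper names `fsync_dir` and `create_durably`): the directory is opened read-only to obtain a file descriptor, and fsync is called on that descriptor. This assumes Linux or a similar POSIX system where a directory can be opened this way; it is not expected to work on Windows.

```python
import os


def fsync_dir(dir_path):
    """fsync a directory via a read-only file descriptor (hypothetical helper)."""
    fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def create_durably(path, data):
    """Write data (bytes) to path, fsync the file, then fsync its directory."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's buffer into the kernel
        os.fsync(f.fileno())  # file contents
    # The directory entry for the file is synced separately.
    fsync_dir(os.path.dirname(os.path.abspath(path)))
```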

    Update: That entry 45 has a lot of common text with this blogpost, which I found quite enlightening.

      Where does it say that not even sync, much less fsync, is adequate for ensuring your data is written to disk and reliably retrievable?

      Sorry - I read that thread a long time ago, and what I wrote summarized my recollection of a lot of reading stimulated by it. In comment #56 Theodore says, of his recommended best practice for syncing from the application: "It is not fool-proof, but then again ext3 was never fool-proof, either." But he doesn't say much about what the remaining risks are.

      A point that was well made elsewhere (e.g. http://www.hitachigst.com/hdd/technolo/writecache/writecache.htm, see the first paragraph under "What is write caching?", and http://old.nabble.com/ext4-finally-doing-the-right-thing-td27186399.html) is that it is not only the operating system and file system that buffer data and potentially reorder operations. Many disk drives have intelligent controllers that also buffer data and reorder writes, and those controllers don't necessarily write their data when you issue a sync at the operating/file system level.

      Risk of data loss or corruption can be reduced by calling the sync functions from the application at critical points. I don't mean to discourage doing so; it can be good practice. But don't get carried away, or system performance may suffer.
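One way to picture "critical points" without getting carried away is to sync once per batch of writes rather than after every write. The following is only a Python sketch under that assumption; `append_records` and `sync_every` are names invented for the example, and the right batch size depends entirely on how much recent data the application can afford to lose.

```python
import os


def append_records(path, records, sync_every=100):
    """Append records (bytes) to a log, fsyncing once per batch.

    Returns the number of fsync calls made. Hypothetical helper.
    """
    syncs = 0
    pending = 0
    with open(path, "ab") as log:
        for record in records:
            log.write(record + b"\n")
            pending += 1
            if pending >= sync_every:
                log.flush()
                os.fsync(log.fileno())
                syncs += 1
                pending = 0
        if pending:
            # Sync whatever is left from the last partial batch.
            log.flush()
            os.fsync(log.fileno())
            syncs += 1
    return syncs
```

With 250 records and sync_every=100 this makes three fsync calls instead of 250; the trade-off is that up to one batch of records may be lost in a crash.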

      In addition, some file system operations create more risk than others, and consideration should be given to how the file system is used, as well as when it is synchronized to disk.

      Finally, for a good balance of reliability and performance, there are many configuration settings that should be considered, in the operating system, file system, RAID and virtual disk system (be they encrypting, compressing or whatever) and in the disk drives and interfaces.
