File I/O: Flush or WriteThrough?
Daniel Crabtree asked a really interesting question:
I've been playing around with file I/O and I am trying to figure out when to use FileOptions.WriteThrough. In your experience, if durability is priority 1 and performance priority 2, is there any reason to use WriteThrough on file streams in addition to or instead of FlushToDisk?
From my experiments, I found the following:
WriteThrough
- Calls to Write block
- Slower than FlushToDisk
FlushToDisk
- Calls to Flush block
- Faster than WriteThrough
Both WriteThrough & FlushToDisk
- Calls to Write block
- Same performance as WriteThrough alone
I'm asking you, as I notice you've used both approaches.
You used WriteThrough:
- And in RavenDB https://github.com/ravendb/raven.munin/blob/master/Raven.Munin/FileBasedPersistentSource.cs
You used Flush with flushToDisk in Rhino Events: https://github.com/ayende/Rhino.Events/blob/master/Rhino.Events/Storage/FileStreamSource.cs
Well… that caught up with me, it appears. Basically, I was a bit lazy with the terms I was using in those blog posts, so I think that I had better clarify.
WriteThrough is the .NET name for FILE_FLAG_WRITE_THROUGH, which tells the OS not to hold writes in its cache and to send them directly to the disk. Writes will block until the data is sent to the disk. In practice, what this means is that you are effectively calling fsync() after each and every write call. That is usually the wrong thing to do, since you can’t get any benefit from OS buffers, batching, etc. You would usually want to do multiple writes and then flush all those changes to disk in one go, and you do want those changes to take advantage of the work the OS can do to optimize your system.
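To make the cost concrete, here is a minimal sketch of the WriteThrough approach. The class and method names (WriteThroughSketch, AppendRecords) are hypothetical and only for illustration. One assumption to note: FileStream keeps its own managed buffer, so the sketch calls Flush() after each Write to hand the bytes to the OS, and it is that OS write which the WriteThrough flag makes synchronous.

```csharp
using System.IO;

public static class WriteThroughSketch
{
    // Hypothetical helper: appends each record via a WriteThrough stream.
    public static void AppendRecords(string path, byte[][] records)
    {
        using (var stream = new FileStream(
            path,
            FileMode.Append,
            FileAccess.Write,
            FileShare.Read,
            4096,                      // FileStream's own managed buffer size
            FileOptions.WriteThrough)) // maps to FILE_FLAG_WRITE_THROUGH on Windows
        {
            foreach (var record in records)
            {
                stream.Write(record, 0, record.Length);

                // Push FileStream's managed buffer to the OS. Because the
                // handle was opened with WriteThrough, this is the point
                // where each record pays for a trip to the disk.
                stream.Flush();
            }
        }
    }
}
```

With N records, this pays for N synchronous disk writes, which is the per-write cost described above.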
Instead, you want to use Flush() with flushToDisk set to true, called once after a batch of writes. Note that even in Munin, both options are used, and WriteThrough can be removed. That said, Munin should by no means be seen as a good implementation of a storage engine.
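As a rough sketch of the batching approach, assuming the records can be written together and only need to be durable once the whole batch is done, the code could look like this. FlushSketch and AppendBatch are hypothetical names used for illustration.

```csharp
using System.IO;

public static class FlushSketch
{
    // Hypothetical helper: writes a whole batch, then flushes to disk once.
    public static void AppendBatch(string path, byte[][] records)
    {
        using (var stream = new FileStream(
            path, FileMode.Append, FileAccess.Write, FileShare.Read))
        {
            foreach (var record in records)
            {
                // These writes may sit in FileStream's buffer and in OS
                // buffers, so the OS is free to batch them.
                stream.Write(record, 0, record.Length);
            }

            // Flush(true) asks for the managed buffer and the OS buffers
            // for this file to be written out to the disk.
            stream.Flush(true);
        }
    }
}
```

The batch is only as durable as the last Flush(true) call, so the caller should not report the records as saved until AppendBatch returns.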
That said, you also have to be aware that Flush doesn’t always do its work: http://connect.microsoft.com/VisualStudio/feedback/details/792434/flush-true-does-not-always-flush-when-it-should
It looks like this is fixed in .NET 4.5, but be aware of it if you need to run on 4.0 or previous versions.
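For code that still has to run on an affected version, one possible workaround is to call the Win32 FlushFileBuffers function directly on the stream's handle. This is an assumption on my part and not something the bug report prescribes. ForcedFlushSketch and FlushToDisk are hypothetical names, and the P/Invoke path is Windows only.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class ForcedFlushSketch
{
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool FlushFileBuffers(SafeFileHandle handle);

    // Hypothetical helper: flush without relying on Flush(true) on Windows.
    public static void FlushToDisk(FileStream stream)
    {
        // First empty FileStream's managed buffer into the OS.
        stream.Flush();

        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            // Outside Windows there is no kernel32; fall back to Flush(true).
            stream.Flush(true);
            return;
        }

        // Ask Windows to write its buffers for this file to the disk.
        if (!FlushFileBuffers(stream.SafeFileHandle))
            throw new Win32Exception(Marshal.GetLastWin32Error());
    }
}
```

On .NET 4.5 and later, calling stream.Flush(true) should be enough, so a helper like this only makes sense as a compatibility shim.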
Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.