Multi-Threaded Design Guidelines for Libraries
The major difference between libraries and frameworks is that a framework is something that runs your code, and is (in general) in control of its own environment. A library is something that you use in your own code, where you control the environment.
Examples for frameworks: ASP.Net, NServiceBus, WPF, etc.
Examples for libraries: NHibernate, RavenDB Client API, JSON.Net, SharpPDF, etc.
Why am I talking about the distinction between frameworks and libraries in a post about multi-threaded design?
Simple: there are vastly different rules for multi-threaded design with frameworks and libraries. In general, frameworks manage their own threads, and will let your code use one of their threads. On the other hand, libraries will use your own threads.
The simple rule for multi-threaded design for libraries? Just don’t do it.
Multi-threading is hard, and you are going to cause issues for people if you don’t know exactly what you are doing. Therefore, just write for a single threaded application and make sure to hold no shared state.
For example, JSON.Net pretty much does this. The sole place where it does do multi threading is where it is handling caching, and it must be doing this really well because I never paid it any mind and we got no error reports about it.
But the easiest thing to do is to just not support multi threading for your objects. If users want to use the code from multiple threads, they are welcome to instantiate multiple instances and use one per thread.
How do you handle shared state?
The easiest way to handle that is to use the same approach that NHibernate and the RavenDB Client API use. You have a factory / builder / fizzy object that you use to construct all of your state, this is done on a single thread, and then you call a method that effectively “freezes” this state from now on.
All future accesses to this state are read only. This is really good for doing things like reflection lookups, loading configuration, etc.
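To make the build-then-freeze idea concrete, here is a minimal sketch. It is in Java, but the shape is the same in C#. The names SettingsBuilder and FrozenSettings are made up for illustration; this is not how NHibernate or the RavenDB Client API actually implement it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; not NHibernate's or RavenDB's real API.
final class SettingsBuilder {
    private final Map<String, String> values = new HashMap<>();

    // All mutation happens here, on the single thread that configures the library.
    SettingsBuilder set(String key, String value) {
        values.put(key, value);
        return this;
    }

    // "Freeze": hand the state over to an object that can no longer change.
    FrozenSettings freeze() {
        return new FrozenSettings(values);
    }
}

final class FrozenSettings {
    // A final field holding an unmodifiable map: safe to read from any thread.
    private final Map<String, String> values;

    FrozenSettings(Map<String, String> source) {
        // Map.copyOf takes a snapshot, so later changes to the builder are not visible here.
        this.values = Map.copyOf(source);
    }

    String get(String key) {
        return values.get(key);
    }
}
```

Because the frozen object exposes no way to modify itself, readers need no locks at all. The only rule the user has to follow is to finish building before sharing.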
But what happens when you actually need shared mutable state? A common example is a cache, or global statistics. This is where you actually need to pull out your copy of Concurrent Programming on Windows and very carefully write true multi threaded code.
It is over a thousand pages, you say? Sure, and you need to know all of this crap to get multi threading working properly. Multi-threading is scary, hard and should not be used.
In general, even if you actually need shared mutable state, you really want to draw clear boundaries between the things that can be shared among multiple threads and the things that cannot. And you want to do most of the work in the parts where you don’t have to worry about multi threading.
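As a rough illustration of that split, the sketch below (Java, with made-up names RequestStats, Worker and StatsDemo) keeps the shared mutable part down to a single counter and leaves everything else single threaded. It leans on the standard LongAdder class for the shared part; a real cache or statistics module would need far more care than this.

```java
import java.util.concurrent.atomic.LongAdder;

// The only type in this sketch that is meant to be shared between threads.
final class RequestStats {
    private final LongAdder requests = new LongAdder();

    void recordRequest() { requests.increment(); }

    long totalRequests() { return requests.sum(); }
}

// Not thread safe by design: create one per thread (or per unit of work).
final class Worker {
    private final RequestStats stats;
    private int localCount; // plain field, only ever touched by the owning thread

    Worker(RequestStats stats) { this.stats = stats; }

    void handle() {
        localCount++;          // single threaded work, no synchronization needed
        stats.recordRequest(); // the one place where threads meet
    }

    int handledByThisWorker() { return localCount; }
}

final class StatsDemo {
    // Runs threadCount threads, each with its own Worker, and returns the shared total.
    static long run(int threadCount, int callsPerThread) {
        RequestStats stats = new RequestStats();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            Worker worker = new Worker(stats);
            threads[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    worker.handle();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats.totalRequests();
    }
}
```

The point of the layout is that anyone reading the code can see at a glance which class has to be reviewed for threading bugs and which classes do not.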
It also means that your users have a much easier time figuring out what the expected behavior of the system is. This is very important with the advent of C# 5.0, since async APIs are going to be a lot more common. Sure, you use the underlying async primitives, but did you consider what may happen when you are issuing multiple concurrent async requests? Is that allowed?
With C# 5.0, you can usually treat async code as if it were single threaded, but that breaks down if you are allowing multiple concurrent async operations.
In RavenDB and NHibernate, we use the notion of Document Store / Session Factory – which are created once, safe for multi threading and are usually singletons. And then we have the notion of sessions, which are single threaded, easy & cheap to create and follow the notion of one per thread (actually, one per work unit, but that is beside the point).
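Here is a bare-bones sketch of that shape, again in Java and with hypothetical Store and Session classes. It only mirrors the division of responsibilities described above; it does not reproduce the real RavenDB or NHibernate types, and nothing is actually sent anywhere.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Document Store / Session Factory.
// Created once, immutable after construction, so it is safe to share.
final class Store {
    private final String url;

    Store(String url) { this.url = url; }

    // Cheap to call; every unit of work gets its own session.
    Session openSession() { return new Session(url); }
}

// Hypothetical session: single threaded, short lived, no locking anywhere.
final class Session {
    private final String url;
    private final List<String> pending = new ArrayList<>(); // unsynchronized on purpose

    Session(String url) { this.url = url; }

    String target() { return url; }

    void store(String document) { pending.add(document); }

    // Returns how many documents would have been sent; this sketch does no I/O.
    int saveChanges() {
        int count = pending.size();
        pending.clear();
        return count;
    }
}
```

Usage would look something like opening a session at the start of a request, calling store a few times, then saveChanges, and throwing the session away. Two threads never touch the same Session, so its plain ArrayList is fine.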
In my next post, I’ll discuss what happens when your library actually wants to go beyond just being safe for multi threading, when the library wants to use threading directly.