I’ve been writing non-blocking, asynchronous code for the past year. Learning how it works and how to write it is not hard. Where are the benefits coming from is what I don’t understand. Moreover, there is so much hype surrounding some programming models, that you have to be pretty good at telling marketing from rumours from facts.
So let’s first start with clarifying the terms. Non-blocking applications are written in a way that threads never block – whenever a thread would have to block on I/O (e.g. reading/writing from/to a socket), it instead gets notified when new data is available. How is that implemented is out of the scope of this post. Non-blocking applications are normally implemented with message passing (or events). “Asynchronous” is related to that (in fact, in many cases it’s a synonym for “non-blocking”), as you send your request events and then get response to them in a different thread, at a different time – asynchronously. And then there’s the “reactive” buzzword, which I honestly can’t explain – on one hand there’s the reactive functional programming, which is rather abstract; on the other hand there’s the reactive manifesto which defines 3 requirements for practically every application out there (responsive, elastic, resilient) and one implementation detail (message-driven), which is there for no apparent reason. And how does the whole thing relate to non-blocking/asynchronous programming – probably because of the message-driven thing, but often the three go together in the buzzword-driven marketing jargon.
Two examples of frameworks/tools that are used to implement non-blocking (web) applications are Akka (for Scala nad Java) and Node.js. I’ve been using the former, but most of the things are relevant to Node as well.
Here’s a rather simplified description of how it works. It uses the reactor pattern (ahaa, maybe that’s where “reactive” comes from?) where one thread serves all requests by multiplexing between tasks and never blocks anywhere – whenever something is ready, it gets processed by that thread (or a couple of threads). So, if two requests are made to a web app that reads from the database and writes the response, the framework reads the input from each socket (by getting notified on incoming data, switching between the two sockets), and when it has read everything, passes a “here’s the request” message to the application code. The application code then sends a message to a database access layer, which in turn sends a message to the database (driver), and gets notified whenever reading the data from the database is complete. In the callback it in turn sends a message to the frontend/controller, which in turn writes the data as response, by sending it as message(s). Everything consists of a lot of message passing and possibly callbacks.
One problem of that setup is that if at any point in the code the thread blocks, then the whole things goes to hell. But let’s assume all your code and 3rd party libraries are non-blocking and/or you have some clever way to avoid blocking everything (e.g. an internal thread pool that handles the blocking part).
That brings me to another point – whether only reading and writing the socket is non-blocking as opposed to the whole application being non-blocking. For example, Tomcat’s NIO connector is non-blocking, but (afaik, via a thread pool) the application code can be executed in the “good old” synchronous way. Though I admit I don’t fully understand that part, we have to distinguish asynchronous application code from asynchronous I/O, provided by the infrastructure.
And another important distinction – the fact that your server code is non-blocking/asynchronous, doesn’t mean your application is asynchronous to the client. The two things are related, but not the same – if your client uses long-living connection where it expects new data to be pushed from the server (e.g. websockets/comet) then the asynchronicity goes outside your code and becomes a feature of your application, from the perspective of the client. And that can be achieved in multiple ways, including Java Servlet with async=true (that is using a non-blocking model so that long-living connections do not each hold a blocked thread).
Okay, now we know roughly how it works, and we can even write code in that paradigm. We can pass messages around, write callbacks, or get notified with a different message (i.e. akka’s “ask” vs “tell” pattern). But again – what’s the point?
That’s where it gets tricky. You can experiment with googling for stuff like “benefits of non-blocking/NIO”, benchmarks, “what is faster – blocking or non-blocking”, etc. People will say non-blocking is faster, or more scalable, that it requires less memory for threads, has higher throughput, or any combination of these. Are they true? Nobody knows. It indeed makes sense that by not blocking your threads, and when you don’t have a thread-per-socket, you can have less threads service more requests. But is that faster or more memory efficient? Do you reach the maximum number of threads in a big thread pool before you max the CPU, network I/O or disk I/O? Is the bottleneck in a regular web application really the thread pool? I couldn’t find a definitive answer.
This benchmark shows raw servlets are faster than Node (and when spray (akka) was present in that benechmark, it was also slower). This one shows that the NIO tomcat connector gives worse throughput. My own benchmark (which I lost) of spray vs spring-mvc showed that spray started returning 500 (Internal Server Error) responses with way less concurrent requests than spring-mvc. I would bet there are counter-benchmarks that “prove” otherwise.
The most comprehensive piece on the topic is the “Thousands of Threads and Blocking I/O” presentation from 2008, which says something I myself felt – that everyone “knows” non-blocking is better and faster, but nobody actually tested it, and that people sometimes confuse “fast” and “scalable”. And that blocking servers actually perform ~20 faster. That presentation, complemented by this “Avoid NIO” post, claim that the non-blocking approach is actually worse in terms of scalability and performance. And this paper (from 2003) claims that “Events Are A Bad Idea (for high-concurrency servers)”. But is all this objective, does it hold true only for the Java NIO library or for the non-blocking approach in general; does it apply to Node.js and akka/spray, and how do applications that are asynchronous from the client perspective fit into the picture – I honestly don’t know.
It feels like the old, thread-pool-based, blocking approach is at least good enough, if not better. Despite the “common knowledge” that it is not.
And to complicate things even further, let’s consider usecases. Maybe you should use a blocking approach for a RESTful API with a traditional request/response paradigm, but maybe you should make a high-speed trading web application non-blocking, because of the asynchronous nature. Should you have only your “connector” (in tomcat terms) nonblocking, and the rest of your application blocking…except for the asynchronous (from client perspective) part? It gets really complicated to answer.
And even “it depends” is not a good-enough answer. Some people would say that you should to your own benchmark, for your usecase. But for a benchmark you need an actual application. Written in all possible ways. Yes, you can use some prototype, basic functionality, but choosing the programming paradigm must happen very early (and it’s hard to refactor it later). So, which approach is more performant, scalable, memory-efficient? I don’t know.
What do I know, however, is which is easier to program, easier to test and easier to support. And that’s the blocking paradigm. Where you simple call methods on objects, not caring about callbacks and handling responses. Synchronous, simple, straightforward. This is actually one of the points in both the presentation and the paper I linked above – that it’s harder to write non-blocking code. And given the unclear benefits (if any), I would say that programming, testing and supporting the code is the main distinguishing feature. Whether you are going to be able to serve 10000 or 11000 concurrent users from a single machine doesn’t really matter. Hardware is cheap. (unless it’s 1000 vs 10000, of course).
But why is the non-blocking, asynchronous, event/message-driven programming paradigm harder? For me, at least, even after a year of writing in that paradigm, it’s still messier. First, it is way harder to trace the programming flow. With a synchronous code you would just tell your IDE to fetch the call hierarchy (or find the usage of a given method if your language is not IDE-friendly), and see where everything comes and goes. With events it’s not that trivial. Who constructs this message? Where is it sent to / who consumes it? How is the response obtained – via callback, via another message? When is the response message constructed and who actually consumes it? And no, that’s not “loose coupling”, because your code is still pretty logically (and compilation-wise) coupled, it’s just harder to read.
What about thread-safety – the event passing allegedly ensure that no contention, deadlocks, or race-conditions occur. Well, even that’s not necessarily true. You have to be very careful with callbacks (unless you really have one thread like in Node) and your “actor” state. Which piece of code is executed by which thread is important (in akka at least), and you can still have a shared state even though only a few threads do the work. And for the synchronous approach you just have to follow one simple rule – state does not belong in the code, period. No instance variables and you are safe, regardless of how many threads execute the same piece of code. The presentation above mentions also immutable and concurrent data structures that are inherently thread-safe and can be used in either of the paradigms. So in terms of concurrency, it’s pretty easy, from the perspective of the developer.
Testing complicated message-passing flows is a nightmare, really. And whereas test code is generally less readable than the production code, test code for a non-blocking application is, in my experience, much uglier. But that’s subjective again, I agree.
I wouldn’t like to finish this long and unfocused piece with “it depends”. I really think the synchronous/blocking programming model, with a thread pool and no message passing in the business logic is the simpler and more straightforward way to write code. And if, as pointed out by the presentation and paper linked about, it’s also faster – great. And when you really need asynchronously sending responses to clients – consider the non-blocking approach only for that part of the functionality. Ultimately, given similar performance, throughput, scalability (and ignoring the marketing buzz), I think one should choose the programming paradigm that is easier to write, read and test. Because it takes 30 minutes to start another server, but accidental complexity can burn weeks and months of programming effort. For me, the blocking/synchronous approach is the easier to write, read and test, but that isn’t necessarily universal. I would just not base my choice of a programming paradigm on vague claims about performance.