Understanding Spring Reactive: Servlet Async

Want to learn more about using Spring Reactive? Check out this post on the Servlet Async feature and the push towards a non-blocking, reactive world.

In the previous article, we discussed how Servlet containers have evolved and turned communication from the Client to the Server into a non-blocking paradigm. In this article, we will be focusing on the evolution of Java Servlets (and Spring) towards a non-blocking, reactive world.

Let’s recall the flow of a request received by the NIO connector:

  1. A few threads (1-4, depending on the number of cores) poll the selector, looking for IO activity on a channel of a connection.
  2. When the selector sees IO activity, it calls a handle method on the connection, and a thread from the pool is allocated to process it.
  3. The thread attempts to read from the connection and parse the input. For an HTTP connection, once the request headers are complete, the thread goes on to handle the request (eventually this reaches the servlet) without waiting for any content.
  4. Once a thread is dispatched to a Servlet, the Servlet IO looks blocking to it, so any attempt to read or write data through HttpInputStream or HttpOutputStream should block. However, because we are using the NIO connector underneath, the IO operations behind HttpInputStream and HttpOutputStream are asynchronous with callbacks. Because the Servlet API is blocking by nature, the container uses a special blocking callback to achieve the blocking behavior.

Step 4 above should clarify the term 'Simulated Blocking' used in the previous article.
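The idea of a blocking callback can be pictured with plain Java. This is only a sketch and not the container's actual code: AsyncChannel, readAsync, and blockingRead are hypothetical names, and a timer thread stands in for real network IO. The read is callback-based underneath, while the caller sees an ordinary blocking call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class SimulatedBlocking {

    /** Hypothetical async channel: hands data to a callback some time later. */
    static class AsyncChannel {
        // Daemon timer thread that plays the role of the IO thread.
        private final ScheduledExecutorService io =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "io");
                    t.setDaemon(true);
                    return t;
                });

        void readAsync(Consumer<String> callback) {
            // Pretend the bytes arrive 50 ms from now.
            io.schedule(() -> { callback.accept("request body"); }, 50, TimeUnit.MILLISECONDS);
        }
    }

    /** Blocking facade: looks synchronous to the caller, is asynchronous underneath. */
    static String blockingRead(AsyncChannel channel) {
        CompletableFuture<String> done = new CompletableFuture<>();
        channel.readAsync(done::complete); // the "blocking callback"
        return done.join();                // the caller waits here until the callback fires
    }

    public static void main(String[] args) {
        System.out.println(blockingRead(new AsyncChannel())); // prints "request body"
    }
}
```

The calling thread is still parked while it waits, which is why this style keeps the one-thread-per-request cost even on top of a non-blocking connector.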

Challenges Prior to Servlet 3.0

Now, coming back to the challenges posed by the one-thread-per-request model: the actual request processing, which is blocking in nature, is done by a thread from a pool managed by the Servlet container (we will call it the request thread). With the NIO connector, the default thread pool size is 200, which implies that only 200 requests can be processed concurrently. The problem with synchronous request processing is that threads doing the heavy lifting run for a long time before the response goes out. If this happens at scale, the Servlet container eventually runs out of threads: long-running threads lead to thread starvation.

This size can be increased (within hardware constraints), but a larger pool also brings the overhead of context switching, cache flushes, and so on. Increasing the number of threads to serve more concurrent requests is not a bad idea in itself, but if the application requires high concurrency, we need to find a more suitable approach. Let’s read on to understand how to handle more concurrent users without increasing the container thread pool size.
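To make the cap imposed by a fixed pool concrete, here is a small self-contained sketch. It does not use a Servlet container: a plain fixed thread pool stands in for the request threads, Thread.sleep stands in for blocking request processing, and the names elapsedMillis, poolSize, and blockMillis are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadStarvationDemo {

    private static void block(long millis) {
        try {
            Thread.sleep(millis); // stands in for blocking request processing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Runs `requests` blocking tasks on `poolSize` threads and returns the wall-clock time in ms. */
    static long elapsedMillis(int poolSize, int requests, long blockMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            long start = System.nanoTime();
            CompletableFuture<?>[] pending = new CompletableFuture<?>[requests];
            for (int i = 0; i < requests; i++) {
                pending[i] = CompletableFuture.runAsync(() -> block(blockMillis), pool);
            }
            CompletableFuture.allOf(pending).join(); // wait for every "response"
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 8 requests that each block for 100 ms
        System.out.println("pool of 2: " + elapsedMillis(2, 8, 100) + " ms");
        System.out.println("pool of 8: " + elapsedMillis(8, 8, 100) + " ms");
    }
}
```

With two threads, the eight requests are served in four rounds, so the later ones spend most of their time waiting for a free thread. A pool of eight finishes in roughly one round, at the cost of more threads.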

Figure: The server thread is blocked during HTTP request processing

Async Servlets in 3.0

An async Servlet enables an application to process incoming requests asynchronously. An HTTP request thread picks up an incoming request and then passes it to a background thread, which is responsible for processing the request and sending the response back to the client. The initial HTTP request thread returns to the HTTP thread pool as soon as it has handed the request over, so it becomes available to process another request.

Figure: The server thread is released during HTTP request processing

Below is a code snippet on how this can be achieved in Servlet 3.0:

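// Imports omitted: javax.servlet.*, javax.servlet.annotation.WebServlet, javax.servlet.http.*, java.io.IOException, java.nio.charset.StandardCharsets, java.util.concurrent.*
@WebServlet(name = "myServlet", urlPatterns = {"/asyncprocess"}, asyncSupported = true)
public class MyServlet extends HttpServlet {
    // Assumed for this sketch: an application-managed pool; size it for your workload
    private final ExecutorService executor = Executors.newFixedThreadPool(10);
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        AsyncContext aCtx = request.startAsync(request, response);
        // process your request in a different thread, using some thread pool executor
        executor.execute(() -> {
            try {
                String json = "json string";
                aCtx.getResponse().getOutputStream().write(json.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                log("Failed to write the async response", e);
            } finally {
                aCtx.complete(); // commits the response and ends async processing
            }
        });
    }
}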

When the asyncSupported attribute is set to true, the response object is not committed on method exit. Calling startAsync() returns an AsyncContext object that caches the request/response object pair. The AsyncContext object is then stored in an application-scoped queue. Without any delay, the doGet() method returns, and the original request thread is recycled. We can configure a thread pool executor on server startup, which will be used to process the request.

After a request is processed, you can either call HttpServletResponse.getOutputStream().write(...) followed by complete() to commit the response, or call dispatch() on the AsyncContext to direct the flow to a JSP page that renders the result. Note that JSP pages are servlets with an asyncSupported attribute that defaults to false. Calling complete() triggers the Servlet container to return the response to the client.

Note: In Spring MVC, the behavior described above for Servlets can be achieved by returning a Callable, DeferredResult, or CompletableFuture from a controller method.

This approach by itself may solve the problem of HTTP thread pool exhaustion, but it does not solve the problem of system resource consumption. After all, another background thread is used to process the request, so the number of simultaneously active threads does not decrease and system resource consumption does not improve. One might therefore think that this is not much of an improvement on the existing stack. Let’s first discuss its implementation in Spring and then figure out in which scenarios it works best and clearly beats synchronous servlets.

We will use a Spring Boot project to expose two endpoints: blockingRequestProcessing, and asyncBlockingRequestProcessing, which uses the async servlet feature.

@GetMapping(value = "/blockingRequestProcessing")
public String blockingRequestProcessing() {
    logger.debug("Blocking Request processing Triggered");
    String url = "http://localhost:8090/sleep/1000";
    // Blocks the Tomcat request thread until the sleeping service answers
    new RestTemplate().getForObject(url, Boolean.TYPE);
    return "blocking...";
}

@GetMapping(value = "/asyncBlockingRequestProcessing")
public CompletableFuture<String> asyncBlockingRequestProcessing() {
    return CompletableFuture.supplyAsync(() -> {
        logger.debug("Async Blocking Request processing Triggered");
        String url = "http://localhost:8090/sleep/1000";
        // Still a blocking call, but it no longer runs on the Tomcat request thread
        new RestTemplate().getForObject(url, Boolean.TYPE);
        return "Async blocking...";
    }); // without an executor argument this runs on ForkJoinPool.commonPool(); pass the configured executor as a second argument to use it
}

Both services above call a REST service endpoint called sleepingService. We can assume that the sleeping service has enough resources and won't be our bottleneck.

Also, I have set the number of Tomcat threads for the sleeping service to 1000. Our own service will have only 10, so that we can quickly reproduce scale issues.
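For reference, in a Spring Boot application the size of the embedded Tomcat pool is normally a single property. The fragment below is an assumption about how such a limit could be applied, since the article does not show its configuration. The property name depends on the Spring Boot version.

```properties
# application.properties of the service under test (assumed location)
# Spring Boot 2.3 and later:
server.tomcat.threads.max=10
# Spring Boot 1.x to 2.2 used this name instead:
# server.tomcat.max-threads=10
```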

Through this setup, we want to examine the performance of our blockingRequestProcessing service.

We can see that blockingRequestProcessing calls the external sleeping service, which sleeps for 1 second. The maximum number of Tomcat threads for our service is 10. We use JMeter to trigger 20 requests per second for 60 seconds. While all the Tomcat threads (10 in our case) are busy processing requests, Tomcat holds the waiting requests in a request queue. When a thread becomes available, a request is retrieved from the queue and processed by that thread. If the queue is full, we get a "Connection Refused" error, but since I didn't change the default size (10,000 for the NIO connector) and we inject only 1,200 requests in total (20 requests per second for 60 seconds), we won't see that. The client timeout in the JMeter configuration is set to 60 seconds. These are the results from JMeter:

Many of the clients received timeouts. Why? JMeter sends 20 requests per second, while our service can process only 10 requests per second, so 10 requests accumulate in the Tomcat request queue every second. This means that at second 60, the request queue holds at least 600 requests. Can the service process all the requests with 10 threads within 60 seconds (the client timeout)? The answer is no.
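The arithmetic behind that answer can be written down as a rough model. It ignores network latency, warm-up, and scheduling jitter, and the method names are made up for illustration.

```java
public class BacklogEstimate {

    /** Requests still waiting after `seconds`, given arrival and service rates per second. */
    static long backlogAfter(long arrivalsPerSecond, long servedPerSecond, long seconds) {
        return Math.max(0, arrivalsPerSecond - servedPerSecond) * seconds;
    }

    /** Seconds needed to serve `totalRequests` at `servedPerSecond`. */
    static long secondsToServe(long totalRequests, long servedPerSecond) {
        return (totalRequests + servedPerSecond - 1) / servedPerSecond; // ceiling division
    }

    public static void main(String[] args) {
        long arrivals = 20; // JMeter: 20 requests per second
        long served = 10;   // 10 Tomcat threads, each busy for 1 second per request
        long duration = 60; // length of the test in seconds

        System.out.println(backlogAfter(arrivals, served, duration));    // prints 600
        System.out.println(secondsToServe(arrivals * duration, served)); // prints 120
    }
}
```

Serving all 1,200 requests at 10 per second takes about 120 seconds, twice the length of the test, so the queue keeps growing for as long as JMeter keeps sending.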

Let's run the same test, but this time against the asyncBlockingRequestProcessing service, which returns a CompletableFuture instead of a String and therefore uses the async servlet feature with a thread pool executor, as explained above.

Everything looks good. All requests were successful, and the response time even went down. What happened? As mentioned before, returning a CompletableFuture (just like returning a Callable) releases the Tomcat thread, and processing is executed on another thread. That other thread is managed by the Spring MVC task executor that we have configured.

We actually improved performance by adding resources, i.e. number of threads from the Executor Thread Pool.
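The hand-off can be sketched outside Spring with nothing but the JDK. Here the main thread plays the role of the Tomcat request thread and a small pool plays the role of the configured executor. The names handle and workers are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class HandOffDemo {

    // Stand-in for the application-managed executor; daemon threads so the JVM can exit.
    static final ExecutorService workers = Executors.newFixedThreadPool(4, r -> {
        Thread t = new Thread(r, "worker");
        t.setDaemon(true);
        return t;
    });

    /** Hypothetical handler: starts the slow work on a worker and returns immediately. */
    static CompletableFuture<String> handle(long workMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(workMillis); // the call still blocks, but it blocks a worker thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "Async blocking...";
        }, workers);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        CompletableFuture<String> pending = handle(300);
        long heldMillis = (System.nanoTime() - start) / 1_000_000;

        // The calling ("request") thread is free again long before the work is done.
        System.out.println("caller held for about " + heldMillis + " ms, done yet? " + pending.isDone());
        System.out.println(pending.join());
    }
}
```

The total amount of blocked thread time is unchanged. It has only moved from the container's pool to the worker pool.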

Note that the request to the sleeping service is still blocking, but it blocks a different thread (a Spring MVC executor thread). Now the question arises: could we have achieved the same performance gain without the async servlet API, simply by increasing the Tomcat max thread configuration for the NIO connector? The answer is yes, but only for specific use cases.

So, Where Could We Use the Servlet 3.0 Async Feature?

Servlet 3.0 async is really useful if the request processing code uses a non-blocking API (in our case, we used a blocking API to call the other service), as shown in the sample code below.

@GetMapping(value = "/asyncNonBlockingRequestProcessing")
public CompletableFuture<String> asyncNonBlockingRequestProcessing() {
    // getRequest is a prepared AsyncHttpClient request; execute() returns without blocking
    ListenableFuture<String> listenableFuture = getRequest.execute(new AsyncCompletionHandler<String>() {
        @Override
        public String onCompleted(Response response) throws Exception {
            logger.debug("Async Non Blocking Request processing completed");
            return "Async Non blocking...";
        }
    });
    return listenableFuture.toCompletableFuture();
}

In the above code, we are making use of the AsyncHttpClient, which calls the sleeping service in a non-blocking way. Hence, with the use of minimal threads here, we could scale our service to serve many more clients concurrently.
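Why so few threads are enough can be shown with a JDK-only sketch. It does not use AsyncHttpClient: a single timer thread stands in for the client's event loop, and callSleepingService is a hypothetical stand-in for the remote call that completes a future after a delay without parking any thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NonBlockingFanOut {

    // One daemon timer thread stands in for the non-blocking client's event loop.
    static final ScheduledExecutorService eventLoop =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "event-loop");
                t.setDaemon(true);
                return t;
            });

    /** Hypothetical non-blocking call: completes after delayMillis, no thread waits meanwhile. */
    static CompletableFuture<String> callSleepingService(long delayMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        eventLoop.schedule(() -> { result.complete("Async Non blocking..."); },
                delayMillis, TimeUnit.MILLISECONDS);
        return result;
    }

    /** Starts `requests` calls at once and returns the wall-clock time until all complete. */
    static long elapsedMillis(int requests, long delayMillis) {
        long start = System.nanoTime();
        CompletableFuture<?>[] calls = new CompletableFuture<?>[requests];
        for (int i = 0; i < requests; i++) {
            calls[i] = callSleepingService(delayMillis);
        }
        CompletableFuture.allOf(calls).join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 100 in-flight calls of 200 ms each, served by a single thread
        System.out.println("100 calls took " + elapsedMillis(100, 200) + " ms");
    }
}
```

All 100 calls finish in roughly the time of one call, because waiting for a response costs no thread. With the blocking approach, the same load would need 100 threads to finish as quickly.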

The benefit of releasing Tomcat threads is clear when it comes to a single Tomcat server with a few WARs deployed. For example, if I deploy two services and service1 needs 10 times the resources of service2, Servlet 3.0 async allows us to release the Tomcat threads and maintain a separate thread pool in each service, sized as needed.

With this, we conclude our discussion of the Servlet 3.0 async feature. We have seen that this feature has changed the way applications are designed, and it acts as a solid foundation for Spring Reactive. Stay tuned for the next article on this!

Source code for this article can be found here.

Published at DZone with permission of Naveen Katiyar , DZone MVB. See the original article here.
