Resource Utilization in Reactive Services
Reactive programming can yield great resource usage optimizations and throughput, provided you understand at least the basics so that you don’t weaken your service.
Let me start this post with a question. Imagine a service that returns a value fetched from another service (e.g., a database), and that fetching the value takes one second (for the sake of the example, let’s assume it always takes exactly 1 second):
import java.util.concurrent.TimeUnit;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class WebApplication {

    public static void main(String[] args) {
        SpringApplication.run(WebApplication.class, args);
    }

    @GetMapping("/value")
    String fetchValue() throws InterruptedException {
        // Simulate a slow downstream call by blocking the worker thread.
        TimeUnit.SECONDS.sleep(1);
        return "42";
    }
}
How many transactions per second can we get when we hit this service with 10 concurrent users?
I know you already have an answer, but let’s be good software engineers and measure instead of guessing. Let’s run Siege in benchmark mode with 10 concurrent users, each issuing 10 consecutive requests:
$ siege -b -c 10 -r 10 http://localhost:8080/value
Transactions: 100 hits
Availability: 100.00 %
Elapsed time: 10.05 secs
Data transferred: 0.00 MB
Response time: 1.00 secs
Transaction rate: 9.95 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 9.99
Successful transactions: 100
Failed transactions: 0
Longest transaction: 1.01
Shortest transaction: 1.00
We got 9.95 transactions per second, close to the theoretical maximum of 10 tps (10 concurrent users × 1 request per second each).
That was easy. Let’s make it a bit more interesting: how many tps will we get if we increase the number of concurrent users to 50?
$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions: 500 hits
Availability: 100.00 %
Elapsed time: 25.05 secs
Data transferred: 0.00 MB
Response time: 2.40 secs
Transaction rate: 19.96 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 47.87
Successful transactions: 500
Failed transactions: 0
Longest transaction: 3.01
Shortest transaction: 1.00
WHAT? It’s not even close to 50 tps. How is that even possible? Well, let me share a secret: I have set the maximum number of Tomcat worker threads to 20.
server.tomcat.max-threads=20
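These numbers follow directly from that limit: 20 worker threads, each busy for 1 second per request, cap the service at 20 tps. So 500 requests take about 500 / 20 = 25 seconds, and with 50 users competing for 20 threads the average response time climbs to roughly 50 / 20 = 2.5 seconds, which is just what Siege measured.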
So now that we know what the limiting factor is, let’s get rid of this custom worker thread limit and repeat the test with the default settings (200 worker threads in the case of Tomcat 8.5):
$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions: 500 hits
Availability: 100.00 %
Elapsed time: 10.06 secs
Data transferred: 0.00 MB
Response time: 1.00 secs
Transaction rate: 49.70 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 49.95
Successful transactions: 500
Failed transactions: 0
Longest transaction: 1.01
Shortest transaction: 1.00
The actual numbers (yes, we got close to 50 tps) are not as interesting as the view of thread usage:
We start with close to 30 live threads in our service, and when users’ requests hit it, we quickly reach 70 live threads, which are kept alive for some time after the traffic is gone in case they can be reused.
Keeping in mind that we’re limited by the number of worker threads, we can easily tell that once concurrent requests exceed that number, we start queuing (can you tell what your services would do if such an excessive hit rate lasted for a longer period of time?).
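How much queuing a Tomcat-based service tolerates before refusing connections is configurable. A minimal sketch of the relevant Spring Boot properties (the values below are illustrative, not recommendations):

server.tomcat.max-threads=200
server.tomcat.max-connections=10000
server.tomcat.accept-count=100

Requests beyond max-threads wait on already-accepted connections, connections beyond max-connections wait in the accept-count backlog, and anything past that is refused, so a sustained overload eventually surfaces as client-side errors or timeouts.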
With such long-running tasks handled that way, we can easily make our service unresponsive. That’s not reactive at all, not resilient at all, not any-other-buzzword-capable at all, and no service in 2017 should be so dull. So let’s sprinkle our service with some reactive magic by replacing the old handler method with a reactive one:
@GetMapping("/value")
Mono<String> fetchValue() {
    // delayElement introduces the 1-second latency without blocking any thread.
    return Mono.just("42")
            .delayElement(Duration.ofSeconds(1));
}
We’re using the org.springframework.boot:spring-boot-starter-webflux dependency instead of org.springframework.boot:spring-boot-starter-web.
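In a real service, the one-second latency would come from a non-blocking call to a downstream system rather than from delayElement. A minimal sketch of what that could look like with Spring’s reactive WebClient (the base URL and path are placeholders invented for this example; the benchmarks below keep using the deterministic delayElement version):

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
public class ValueController {

    // Placeholder downstream address, for illustration only.
    private final WebClient client = WebClient.create("http://localhost:8081");

    @GetMapping("/value")
    Mono<String> fetchValue() {
        // Non-blocking: no thread is parked while waiting for the response.
        return client.get()
                .uri("/value")
                .retrieve()
                .bodyToMono(String.class);
    }
}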
With all these reactive bits and pieces in place, let’s see what happens if we hit our service with 50 concurrent users:
$ siege -b -c 50 -r 10 http://localhost:8080/value
Transactions: 500 hits
Availability: 100.00 %
Elapsed time: 10.06 secs
Data transferred: 0.00 MB
Response time: 1.01 secs
Transaction rate: 49.70 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 49.99
Successful transactions: 500
Failed transactions: 0
Longest transaction: 1.02
Shortest transaction: 1.00
Almost 50 tps, the same result we had with the non-reactive, plain-old Servlet-based example running on Tomcat with the default 200 worker threads. It doesn’t look impressive until you take a look at thread usage:
We start with around 20 live threads and then go up to 30 of them (the number of threads used depends on the number of CPU cores you have). Not bad for handling 50 concurrent requests.
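That dependence on CPU cores comes from the event loop: by default, Reactor Netty sizes its I/O worker pool from the number of available processors (with a small lower bound), so the thread count stays flat no matter how many requests are in flight. A quick way to check what your JVM sees (the override shown in the comment exists in recent Reactor Netty versions; treat it as an assumption for older ones):

// The event loop is sized from the CPU count, not from the request count.
int cores = Runtime.getRuntime().availableProcessors();
System.out.println("CPU cores visible to the JVM: " + cores);
// Optional override in recent Reactor Netty versions:
// -Dreactor.netty.ioWorkerCount=4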
Can you guess how many threads will be used for handling 250 concurrent users?
It’s still the same number of threads, and we were able to get close to 250 tps.
$ siege -b -c 250 -r 10 http://localhost:8080/value
Transactions: 2500 hits
Availability: 100.00 %
Elapsed time: 10.08 secs
Data transferred: 0.00 MB
Response time: 1.01 secs
Transaction rate: 248.02 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 249.44
Successful transactions: 2500
Failed transactions: 0
Longest transaction: 1.03
Shortest transaction: 1.00
Of course, achieving such good results was only possible because we were reactive all the way down the stack. Should we block the execution thread, the results would be worse than the ones from the classic Servlet-based service. Moreover, we wouldn’t achieve such high tps rates with computation-intensive tasks; typical services, however, spend most of their time blocked on I/O.
Having said that, let me stress it once more: we were able to handle 250 concurrent requests with just a handful of threads. As the old Unix saying goes, “less is more.”
As a side note: if you prefer functional-style programming, you can replace the Spring MVC-style handler method mapping (and get rid of the @RestController annotation as well) with a RouterFunction definition that routes incoming requests to handler functions:
import static org.springframework.web.reactive.function.server.RequestPredicates.GET;
import static org.springframework.web.reactive.function.server.RouterFunctions.route;

@Bean
RouterFunction<ServerResponse> routerFunction() {
    return route(GET("/value"), request -> fetchValueHandler());
}

Mono<ServerResponse> fetchValueHandler() {
    return ServerResponse.ok()
            .body(fetchValue(), String.class);
}

Mono<String> fetchValue() {
    return Mono.just("42")
            .delayElement(Duration.ofSeconds(1));
}
But it does not make your service run faster, nor does it use fewer threads.
Do you feel distracted by this side note about functional-style routing? I hope so, because I’ll ask one more question about the achievable tps rate for the following example:
@GetMapping("/value")
Mono<String> fetchValue() throws InterruptedException {
    TimeUnit.SECONDS.sleep(1);
    return Mono.just("42");
}
How many tps can we get when we hit this service with 100 concurrent users?
In this case, the tps rate is really scary: way worse than the one from the non-reactive, plain-old Servlet-based example. I hope you spotted the problematic part.
The problem is that we blocked on the execution thread, which means really low tps rates (how many [hyper-threaded] CPU cores do you have?) and lots of queued (or timed-out) requests.
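If you really have to call blocking code from a reactive handler, Reactor can at least move it off the event loop onto a scheduler meant for blocking work. A minimal sketch using Schedulers.boundedElastic() (in older Reactor versions, the equivalent was Schedulers.elastic()):

import java.util.concurrent.TimeUnit;

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@GetMapping("/value")
Mono<String> fetchValue() {
    // Wrap the blocking call and subscribe on a scheduler dedicated to
    // blocking work, so the event-loop threads stay free for I/O.
    return Mono.fromCallable(() -> {
        TimeUnit.SECONDS.sleep(1);
        return "42";
    }).subscribeOn(Schedulers.boundedElastic());
}

This keeps the event loop responsive, but every in-flight blocking call still occupies a thread, so it is damage control rather than true non-blocking behavior. (If you want to catch accidental blocking automatically, the Reactor team’s BlockHound agent can detect blocking calls on event-loop threads at runtime.)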
As you can see, reactive programming can yield great resource usage optimizations and allow you to achieve better throughput, but you have to understand at least the basics of this approach so that you don’t bring your service to its knees. And that’s only part of the reactive services story; there are other important pieces, like request and response body serialization or backpressure, to name just a few.