I still recall the summer of 2013, when a single URL in my whole application brought the servers down. The problem was simple: a bot decided to index our site at a very high rate, generating millions of URL combinations that bypassed our entire caching layer and hit the application servers directly. We had a very high cache-hit rate (95% or so), so the application server layer was never designed for high load (it was Adobe AEM 5.6, and the logic to run searches and render pages was computationally heavy). Earlier that year we had discussed handling the dog-pile effect and putting some sort of throttling in place. At the start of that conversation, everyone frowned at the idea of throttling (except 2 people).
In the fall of 2012, Ravi Pal had suggested putting error handling in place so that the system would not just fall on its head but degrade gracefully. I only realized the gravity of his suggestion when we hit this problem in 2013.
Now I am here working on yet another platform, and the minute I bring up the idea of throttling, it is frowned upon again. One guy actually laughed at me in a meeting. Someone else suggested we handle the scenario with "auto-scaling" instead of throttling. Our infrastructure is on the AWS cloud; I am not an expert, but the experts tell me a server can be replicated as-is in around 10 minutes (we will be benchmarking this very soon).
I was an ambitious architect who thought he controlled the traffic coming to his site. I no longer live under that illusion.
This may turn into a series of posts, but today I want to start by showing that you do not have a choice: whether you like it or not, the system will throttle your traffic for you.
The Benchmark Overview
- A simple Web application built using Spring Boot
- A Spring MVC REST controller that accepts HTTP requests and sends back an OK response after an induced delay
- Apache JMeter to simulate load
- A custom plugin (a big shoutout to these guys for the plugin) to generate a stepped load and capture enhanced graphs
- Tomcat 8.x to host the site, launched embedded via Spring Boot with no customizations
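For reference, the endpoint under test boils down to something like the sketch below: accept a request, hold the thread for a fixed induced delay, then reply "OK". The class name, the 1000 ms delay, and the `/ping` path are my assumptions for illustration (the post does not show the original controller); the Spring MVC annotations it would carry in the real app are shown as comments so the sketch compiles standalone.

```java
// Essence of the benchmarked endpoint, minus the Spring wiring.
// In the real app this would be annotated with @RestController and
// the method with @GetMapping("/ping"); names and delay are assumed.
public class DelayedPingController { // @RestController in the real app

    static final long INDUCED_DELAY_MS = 1000;

    // @GetMapping("/ping") in the real app
    public String ping() throws InterruptedException {
        // Hold the Tomcat worker thread for the whole delay -- this is
        // what makes each request "expensive": one thread is pinned per
        // in-flight request for a full second.
        Thread.sleep(INDUCED_DELAY_MS);
        return "OK";
    }
}
```

Because the handler sleeps rather than computes, the benchmark stresses Tomcat's thread pool rather than the CPU, which is exactly what we want to observe here.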
First Group – The Good One
This thread group simulates a consistent stream of requests to our application server – a typical, everyday scenario.
As Expected? Yes.
As you see below, the chart shows the application server behaving normally. All requests over the 15-minute period are consistent with a "single user model", i.e., a 1-second request/response time.
Second Group – The Sudden High Traffic
This test plan uses a stepped approach to simulate a scenario where a campaign starts driving traffic to a certain page (or set of pages) for a short duration. This is a use case we see all the time in the industry, where our websites are open for the whole world to hit.
This thread group is not available OOTB; it comes from the plugin mentioned above.
So what do we expect to happen? Depending on how much juice my server has (threads, CPU cycles, etc.), it may or may not be able to handle the requests. Given that I am running everything on my local laptop, it will be interesting to see whether my local box can handle 600 threads.
And we see that my laptop can't really handle 600 threads. So what does Tomcat do?
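What Tomcat does, loosely speaking, is throttle for you: it has a bounded worker pool (`maxThreads`, 200 by default in Tomcat 8) and a bounded accept queue (`acceptCount`), and anything beyond both is refused. The sketch below models that behavior with a plain `java.util.concurrent` thread pool; this is an analogy with scaled-down numbers, not Tomcat's actual connector code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ImplicitThrottle {

    // Offer `offered` instant submissions to a pool with 4 workers
    // (think maxThreads) and a queue of 8 (think acceptCount).
    // Returns { served, rejected }.
    public static int[] simulate(int offered) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                4, 4, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(8));
        AtomicInteger served = new AtomicInteger();
        int rejected = 0;
        for (int i = 0; i < offered; i++) {
            try {
                pool.execute(() -> {
                    try { Thread.sleep(100); } // each "request" takes 100 ms
                    catch (InterruptedException ignored) { }
                    served.incrementAndGet();
                });
            } catch (RejectedExecutionException e) {
                rejected++; // the pool throttled this request for us
            }
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] { served.get(), rejected };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = simulate(50);
        System.out.println("served=" + r[0] + " rejected=" + r[1]);
    }
}
```

With 50 near-simultaneous submissions, only 4 run and 8 queue; the rest are rejected outright. Whether you configured it or not, a ceiling exists – the only question is whether you chose where it sits and what happens when it is hit.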
How the Good One's Behavior Changes
I run the first test plan and follow it up with the high-traffic plan (introduced after a 30-second delay).
The following image shows how the Good One has been impacted. Although its own traffic has not changed a bit, it is still affected because something else introduced a spike.
Please go and tell the JVM that you do not like throttling
So What’s Next
You really have only 3 choices (we will look at the details of each in another post).