It's refreshing when you read an article by an IT industry analyst that is thoroughly researched and insightful. So many posts and articles in the technology press seem to be just rewritten press releases, so we were happy to see an article by Scott M. Fulton III about the 3.2 release which called us out on something John Wetherill had said several months ago about application autoscaling being "nearly impossible" for us to include in Stackato.
Well, clearly it wasn't impossible.
I can't let John take the rap for this, because it's something I had been saying too. Whenever I was asked why Stackato didn't do application autoscaling out-of-the-box, my standard response was something along the lines of "Scaling events are best left to the discretion of the people who built the application" or "Every application has different metrics for determining performance degradation." Basically, there's no magic bullet for automatic application scaling.
Here's the rationale I put forward: Each language, each application framework, indeed each application, has different choke points. Allocating more memory to an application that's hitting the Python global interpreter lock won't solve that particular problem. Spawning multiple JVMs is not going to help if those JVM instances are all hitting out-of-memory errors. If the problem is with database latency, adding more memory or more instances is pointless.
All that said, for a large class of web applications increasing the number of application instances does help performance. The Stackato product management and development teams decided that there was no reason not to provide a simple mechanism to allow these applications to scale automatically in response to CPU load.
What Stackato Does for You
As of Stackato 3.2, you'll see an Instances tab in the Application view with an "Application Autoscaling Settings" section.
You can use the sliders to set a minimum and maximum number of instances, a range which Stackato will stay within when scaling your application up and down. Scaling events (up or down) are triggered based on the CPU thresholds set in the other slider. These are worth talking about in a bit more detail.
Stackato application containers (Linux containers at their most basic level) have a certain percentage of CPU cycles allocated to them depending on how many other containers are running on the host. CPU load is calculated based on each container's fair share of the VM's available resources.
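To make the "fair share" idea concrete, here is a minimal sketch in Python. It assumes the simplest possible model, in which the host's CPU capacity is divided equally among the running containers; the function name and parameters are hypothetical, and this is not how Stackato itself is known to compute the figure.

```python
def fair_share_load(container_cpu_used: float,
                    host_cpu_capacity: float,
                    container_count: int) -> float:
    """Return a container's CPU load as a percentage of its fair share.

    Assumption (not from Stackato's source): every container on the
    host is entitled to an equal slice of the host's CPU capacity.
    All CPU figures are in the same unit, e.g. cores.
    """
    if container_count < 1:
        raise ValueError("container_count must be at least 1")
    if host_cpu_capacity <= 0:
        raise ValueError("host_cpu_capacity must be positive")

    # Each container's entitlement shrinks as more containers share the host.
    fair_share = host_cpu_capacity / container_count
    return 100.0 * container_cpu_used / fair_share


if __name__ == "__main__":
    # A 4-core host running 8 containers gives each a 0.5-core share,
    # so a container burning 0.4 cores is at roughly 80% of its share.
    print(fair_share_load(0.4, 4.0, 8))
```

Under this model the same absolute CPU usage counts as a higher load when more containers share the host, which is why the thresholds are expressed relative to the container's allocation rather than the whole VM.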
Using the default settings as an example, if the application is using more than 80% of the container's CPU allocation, Stackato will create a new application container and its router will balance between all containers. It will continue to do this at a rate of one instance per minute until the load falls below the upper bound or the maximum number of instances is reached.
If the application is mostly idle, running at less than 20% of the container's CPU allocation, Stackato will prune application instances until the load rises above the lower bound or the minimum number of instances is reached.
Stackato measures the load every 10 seconds, but will trigger scaling events (both up and down) at most once per minute. Stackato admins can tweak these settings and the overall application autoscaling behavior globally in the system configuration (i.e. 'kato config …').
That's it. Very simple settings for a simple mechanism for scaling apps up and down.
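The rules above can be sketched as a small simulation. This is only an illustration under stated assumptions, not Stackato's implementation: the class and function names are hypothetical, the minimum and maximum instance counts are made-up values, and the sketch acts on the latest 10-second sample because the post does not say how samples are aggregated. The 80% and 20% thresholds, the 10-second sampling interval and the one-event-per-minute limit come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AutoscalePolicy:
    min_instances: int = 1       # hypothetical value
    max_instances: int = 4       # hypothetical value
    cpu_low: float = 20.0        # scale down below this % of CPU allocation
    cpu_high: float = 80.0       # scale up above this % of CPU allocation
    sample_interval: int = 10    # seconds between load measurements
    cooldown: int = 60           # minimum seconds between scaling events


def simulate(policy: AutoscalePolicy,
             loads: List[float],
             start_instances: int = 1) -> List[int]:
    """Return the instance count after each load sample.

    `loads` holds one CPU percentage per sampling interval.
    Assumption: each decision uses only the most recent sample.
    """
    instances = start_instances
    last_event: Optional[int] = None  # time of the last scaling event
    history = []

    for i, load in enumerate(loads):
        now = i * policy.sample_interval
        # Scaling events, up or down, happen at most once per cooldown.
        can_scale = last_event is None or now - last_event >= policy.cooldown

        if can_scale:
            if load > policy.cpu_high and instances < policy.max_instances:
                instances += 1
                last_event = now
            elif load < policy.cpu_low and instances > policy.min_instances:
                instances -= 1
                last_event = now

        history.append(instances)

    return history


if __name__ == "__main__":
    policy = AutoscalePolicy(min_instances=1, max_instances=3)
    # Two minutes of sustained high load, sampled every 10 seconds.
    print(simulate(policy, [90.0] * 13))
```

With sustained load above the upper threshold, the instance count in this sketch climbs by one per minute and then stops at the configured maximum; with sustained load below the lower threshold it shrinks at the same rate until it reaches the minimum.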
What You Still Need to Do
…but things are not always that simple. Different applications behave differently under load. Having a handy interface for scaling application instances does not absolve the developer of the responsibility to analyse the performance of the app in more detail. Luckily, there are a number of tools (like New Relic) that help you to get to the bottom of performance problems.
Use the profiling tools for whatever language or application framework you are using to find the choke points, places where spawning multiple instances might not help performance.
At Least We Said "Almost"
I'm glad we didn't say "It's impossible to do application autoscaling." As we were discussing this post, John came across this quote from Paul Buchheit, the creator of Gmail:
If someone says: "That's impossible."
You should understand it as: "According to my very limited experience and narrow understanding of reality, that's very unlikely."
Perhaps that's a bit grandiose in the context of this feature. It was really just a situation where for 20% of deployed applications, automatic scaling would be "almost impossible" in a polyglot PaaS that supports many, many runtimes, frameworks and services. For the remaining 80% though, the problem just needed a bit of careful consideration and hard work by Eric Promislow. Kudos to him.
And kudos to Scott Fulton for paying such close attention to what we say and keeping us honest.
And shame on me for underestimating the problem-solving abilities of the Stackato developers.