Containers Are Great!
You already know this, since you’re reading this article, but even the most ardent fan of containers probably doesn’t fully appreciate what they’ve enabled in the world of application development and application operations.
Essentially, they make it easier to deploy almost anything. Yes, there are the efficiency benefits of containers, inching closer to using (and paying for) only the resources you need. But it’s the ease of deployment of practically any application or IT system that can truly change the IT world. Unfortunately, being able to easily deploy almost anything means that it’s especially easy to deploy almost EVERYTHING.
Agility + Continuous Delivery + Technology Diversity = Operational Nightmare
While that ability, in itself, is nothing to worry about, development teams’ embrace of extreme agility and Continuous Delivery, coupled with this insanely easy method of deploying anything and everything, has placed an onerous burden on the shoulders of IT Operations. As the CTO of a SaaS company, I live in this space between Development and Operations every day, since I manage both.
Now, what do we mean by everything? Of course, there are hundreds of different technology components being used to build applications more quickly, more efficiently, and more responsively to users. But in reality, most of us are using some combination of about 80 things: different languages, database systems, middleware app servers, messaging technologies, storage, web servers, and much more.
From the Development perspective, when you couple the ability to use whatever technology platform and/or language you want with the drive to do things faster and more continuously, you get each team (and each developer) making the optimal decision for themselves regarding technology usage. So, while 60 percent of development might use MySQL for their database, 20 percent might be using SQL Server, 10 percent might be using MongoDB, and the remaining 10 percent might be spread across another three different databases.
For Operations, the Developer’s “paradise” has become the Operator’s nightmare. It will no longer be sufficient to have a MySQL expert on the team with everyone else understanding the basics of configuring and maintaining a MySQL Server. Instead, everyone has to be able to handle whatever might come up with a whole host of different database systems.
Now multiply that problem by the number of systems in operation: security, web server, app server, storage, messaging, transactions, directories, search tools, and more. With all of this mass diversity, how can Ops even hope to cope?
Automation in Operations
Automation isn’t a panacea that fixes all problems. But in this hyper-agile application world, it’s an absolute requirement. This isn’t the automation of the nineties, when we were all hypothesizing about lights-out IT operations. No, this is automation that’s required simply so your Operations team can actually do the things they need to do over the lifecycle of your applications.
What should be automated? Some would like to say “everything,” but let’s start with the most important aspects of Operations: the ability to deliver application services to end users at the proper scale and proper performance levels (or service levels). So, monitoring has to be one of the first things Operations should automate.
Automated Container Monitoring
Which brings us full circle back to those little containers. Not only do they make it easy to overrun the operations team with this mass diversity of systems, but they add a layer that’s difficult to penetrate when it comes to understanding how the overall system (or application) is performing.
The first thing to automate is understanding what’s actually running in a container (the app technology), how it fits within the overall application architecture (dependency map), and how both the container and the technology inside are executing their responsibilities to deliver the appropriate service levels.
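As a sketch of what that first automation step might look like, the snippet below builds a crude inventory and dependency map from container metadata. The container names, images, and links here are hypothetical, hard-coded stand-ins; in a real system this metadata would come from the container runtime’s API (for example, the Docker or Kubernetes API).

```python
# Minimal sketch: turn container metadata into an inventory plus dependency map.
# All data below is illustrative -- a real collector would query the runtime API.

containers = [
    {"name": "web-1",   "image": "nginx:1.25", "links_to": ["app-1"]},
    {"name": "app-1",   "image": "myapp:2.3",  "links_to": ["db-1", "cache-1"]},
    {"name": "db-1",    "image": "mysql:8.0",  "links_to": []},
    {"name": "cache-1", "image": "redis:7",    "links_to": []},
]

def build_dependency_map(containers):
    """Map each container to the technology it runs and what it depends on."""
    return {
        c["name"]: {
            # Crude heuristic: treat the image name as the technology inside.
            "technology": c["image"].split(":")[0],
            "depends_on": c["links_to"],
        }
        for c in containers
    }

dep_map = build_dependency_map(containers)
print(dep_map["app-1"])  # {'technology': 'myapp', 'depends_on': ['db-1', 'cache-1']}
```

Even a toy map like this answers the two questions above: what is actually running in each container, and where it sits in the overall application architecture.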
The next step in automation is monitoring health (again, of both the container and the thing running inside it). This is best done by machines: either an expert system that understands what should be measured and what different metrics mean for the overall health of the system, or, further down the AI path, machine learning, where the monitoring system gleans patterns of execution, determines causal correlations between end-user service levels and individual component measurements, and otherwise handles systemically what would be an impossible task for human operators.
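To make the learned-baseline idea concrete, here is a minimal, hedged sketch of a machine-driven health check: it flags a metric reading that strays too far from its own recent history. The response-time values and the three-sigma threshold are illustrative assumptions, not a prescription for any particular tool.

```python
import statistics

def is_anomalous(history, latest, n_sigma=3.0):
    """Flag a reading more than n_sigma standard deviations from the
    recent baseline -- a toy stand-in for a learned health baseline."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean  # flat baseline: any change is anomalous
    return abs(latest - mean) > n_sigma * stdev

# Steady response times (ms), then a sudden spike:
baseline = [102, 98, 101, 99, 100, 103, 97]
print(is_anomalous(baseline, 100))  # False -- within normal variation
print(is_anomalous(baseline, 250))  # True  -- well outside the baseline
```

A real system would maintain baselines per metric, per container, and per service, and correlate anomalies across the dependency map rather than alerting on each metric in isolation.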
Speed Kills (Monitoring Effectiveness)
The other thing that happens with containerized applications, especially in an agile/Continuous Delivery environment, is the rate at which containers are provisioned and destroyed as they are needed (and then not needed). Traditional monitoring tools (15-minute metrics, 1-hour notifications) and APM tools (1-minute metrics, 5-minute notifications) will miss anywhere from 10 to 50 percent of the containers that execute at all.
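A rough back-of-envelope model shows why the polling interval matters so much here. Assuming a container starts at a random point within a fixed polling cycle and is simply never observed if it exits before the next poll fires, even a 1-minute poller misses a large share of short-lived containers. The lifetimes below are illustrative, not measured data.

```python
import random

def fraction_missed(lifetimes, poll_interval, trials=10000, seed=42):
    """Estimate the fraction of containers a fixed-interval poller never sees.

    Each trial starts a container at a random offset within the polling
    cycle; the container is missed if it exits before the next poll fires.
    """
    rng = random.Random(seed)
    missed = 0
    for _ in range(trials):
        lifetime = rng.choice(lifetimes)
        next_poll = rng.uniform(0, poll_interval)  # time until the next poll
        if lifetime < next_poll:
            missed += 1
    return missed / trials

# Containers living 10-120 seconds vs. a 60-second (1-minute) polling cycle:
lifetimes = [10, 20, 30, 60, 120]
print(round(fraction_missed(lifetimes, poll_interval=60), 2))
```

Under these assumptions the poller misses roughly 40 percent of the containers, which is squarely in the 10-to-50-percent range cited above; shorter lifetimes or longer polling intervals push the number higher.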
And while container orchestration tools are a way to better manage the deployment of individual (or even sets of) containers, these tools don’t consider the overall performance of the application, nor can they take into account outside influences (or resource suckers) that make their allocations suboptimal.
A new breed of monitoring and performance-management tools is needed: tools that recognize both the overall system and the individual components running inside containers (and even the containers themselves). These new tools should also be able to work hand-in-hand with orchestration and other container management systems to ensure that everything operationally works together to meet the purpose of containerized apps in the first place: a fantastic user experience.