How to Monitor Your Kubernetes Cluster

In this blog post, the author will discuss how to monitor a Kubernetes cluster with CoScale.

See Gartner’s latest research on the application performance monitoring landscape and how APM suites are becoming more and more critical to the business, brought to you in partnership with AppDynamics.

In a previous blog post, I explained how to monitor Docker containers with CoScale. Once your Dockerized application grows, an orchestrator becomes almost a necessity for managing and scaling your infrastructure. A popular choice is Kubernetes, which was initially designed by Google and is now maintained by the Cloud Native Computing Foundation. Kubernetes provides a platform for automating the deployment, scaling, and operation of application containers across clusters of hosts. Monitoring a large application deployed on a Kubernetes cluster can be challenging. In this blog post, I will discuss how to monitor a Kubernetes cluster with CoScale.

Installation of the CoScale Agent

To start, you need to get the CoScale agent running in your Kubernetes cluster. The easiest way to do this is to install it directly on the Kubernetes master node. If you cannot install an agent on your master node, you can install it via a DaemonSet instead. Once the CoScale agent is installed on your Kubernetes cluster, enabling the Kubernetes plugin automatically collects detailed Kubernetes metrics and events, as well as additional metadata from the Kubernetes API. If you want more detailed metrics about the containers themselves and the services that run inside them, it is recommended that you also run the CoScale agent with the Docker plugin on your minions. Check my previous blog post for more details.
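To make the DaemonSet option more concrete, here is a minimal sketch in Python that builds a generic DaemonSet manifest, which schedules one agent pod on every node. The image name, the resource name, and the `build_agent_daemonset` helper are hypothetical placeholders, not the official CoScale manifest; the agent's real image and required settings should come from the CoScale documentation. The sketch assumes a cluster that serves DaemonSets under `apps/v1`.

```python
import json


def build_agent_daemonset(name, image, namespace="default"):
    """Return a DaemonSet manifest (as a dict) that runs one pod per node.

    `name` and `image` are placeholders chosen by the caller; nothing here
    is specific to any particular monitoring agent.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference, used only for illustration.
    manifest = build_agent_daemonset(
        "monitoring-agent", "example.com/monitoring-agent:latest"
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, the printed output can be saved to a file and applied directly once the placeholders are replaced with real values.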

Cluster Overview

Once you have installed the Kubernetes plugin on the master node, a number of default dashboards are created with the most important metrics from Kubernetes. The first dashboard gives you a high-level overview of the cluster itself. Here you can see how many containers are running on the different minions, as well as the total number of containers in your cluster. We also show the number of replication controllers and services. The event timeline gives you a clear view of when containers were created and killed in your cluster.
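For readers curious how a per-node container count like the one on this dashboard can be derived, the following is a rough sketch, not CoScale's implementation. It assumes you already have a list of pod objects shaped like the Kubernetes API's pod JSON (with `spec.nodeName` and `status.containerStatuses`); the `containers_per_node` helper is a name made up for this example.

```python
from collections import Counter


def containers_per_node(pods):
    """Count running containers per node from a list of pod dicts.

    Each pod is expected to look like a Kubernetes API pod object.
    Pods that have not been scheduled to a node yet are skipped.
    """
    counts = Counter()
    for pod in pods:
        node = pod.get("spec", {}).get("nodeName")
        if node is None:
            continue  # pod is not scheduled yet
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A container is running when its state has a "running" entry.
        counts[node] += sum(1 for s in statuses if "running" in s.get("state", {}))
    return counts


if __name__ == "__main__":
    sample = [
        {"spec": {"nodeName": "node-a"},
         "status": {"containerStatuses": [{"state": {"running": {}}}]}},
        {"spec": {"nodeName": "node-b"},
         "status": {"containerStatuses": [{"state": {"waiting": {}}}]}},
    ]
    per_node = containers_per_node(sample)
    print(dict(per_node), "total:", sum(per_node.values()))
```

Summing the per-node values gives the cluster-wide total.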


Cluster Details

A second dashboard shows the cluster from a different, more detailed perspective. It lets you see on which minions your containers are (or were) running. A solid line connects a currently running container to its minion node; a dotted line marks a container that was running during the selected time window but had stopped by the end of it. You can filter on replication controller and on service to focus on one specific application in your Kubernetes cluster. You can also select a container metric and color-code the containers based on it. In the example below, a container turns red when its CPU usage is above 90% and orange when it is between 75% and 90%. This allows you to easily pinpoint the containers that might be causing performance issues in your application.
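The color-coding rule from this example can be written down as a small function. This is only an illustration of the thresholds mentioned above: the function name, the "normal" label for containers below 75%, and the choice to treat exactly 90% as orange are assumptions of this sketch.

```python
def color_for_cpu(cpu_percent, warn=75.0, critical=90.0):
    """Map a CPU usage percentage to a display color.

    Above `critical` is red; from `warn` up to and including `critical`
    is orange; anything lower gets the placeholder label "normal".
    """
    if cpu_percent > critical:
        return "red"
    if cpu_percent >= warn:
        return "orange"
    return "normal"


if __name__ == "__main__":
    for usage in (42.0, 80.0, 95.5):
        print(usage, color_for_cpu(usage))
```

Keeping the thresholds as parameters mirrors the dashboard, where you decide the color coding per metric.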


Replication Controller and Microservice Visualizations

Another interesting way to view the containers in your cluster is grouped by replication controller or grouped by service. This view allows you to see the current replicas at a quick glance and lets you understand the container utilization of a particular microservice.


Replication Controller Errors

You also want to know if any of your replication controllers are missing replicas. These kinds of errors appear in the replication controller overview widget shown below. When you hover over an error, you get further details on why some of the containers in your application are dying.
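The underlying check is a comparison between the desired and the ready replica counts. Here is a minimal sketch of that comparison, assuming replication controller objects shaped like the Kubernetes API's JSON (`spec.replicas` and `status.readyReplicas`); the `missing_replicas` helper is hypothetical and says nothing about how CoScale detects these errors internally.

```python
def missing_replicas(controllers):
    """Return {controller name: number of missing replicas}.

    Only controllers with fewer ready replicas than desired are included.
    """
    result = {}
    for rc in controllers:
        name = rc["metadata"]["name"]
        # Kubernetes defaults spec.replicas to 1 when it is not set.
        desired = rc.get("spec", {}).get("replicas", 1)
        # status.readyReplicas is omitted from the API output when it is 0.
        ready = rc.get("status", {}).get("readyReplicas", 0)
        if ready < desired:
            result[name] = desired - ready
    return result


if __name__ == "__main__":
    sample = [
        {"metadata": {"name": "frontend"},
         "spec": {"replicas": 3}, "status": {"readyReplicas": 1}},
        {"metadata": {"name": "backend"},
         "spec": {"replicas": 2}, "status": {"readyReplicas": 2}},
    ]
    print(missing_replicas(sample))
```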


Container Lifecycle

While the above visualization gives you a high-level overview of possible problems with your replication controllers, sometimes you also want to take a deeper dive into a specific replication controller or service. For that reason, CoScale also shows you the container lifecycles per replication controller or service. This gives you detailed insight into the number of running containers, failed containers, container start and stop times, etc.

In the example below, you can see that a bunch of new containers were started at 9 pm and killed the next afternoon. These kinds of insights allow you to troubleshoot your Kubernetes cluster and see where you need to add containers to handle the load.
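A lifecycle view of this kind boils down to turning container start and stop times into a running-container count over time. The sketch below shows one way to do that with a simple event sweep. The `running_over_time` helper and the input format (pairs of start and stop times, with `None` for containers that are still running) are assumptions of this example, not CoScale's data model.

```python
def running_over_time(lifecycles):
    """Return a sorted list of (time, running containers) change points.

    `lifecycles` is a list of (start, stop) pairs; `stop` is None for a
    container that is still running. Times can be any comparable values,
    such as hours since the beginning of the selected window.
    """
    events = []
    for start, stop in lifecycles:
        events.append((start, 1))       # container started
        if stop is not None:
            events.append((stop, -1))   # container stopped
    events.sort()

    timeline = []
    running = 0
    for time, delta in events:
        running += delta
        if timeline and timeline[-1][0] == time:
            # Merge several events at the same instant into one point.
            timeline[-1] = (time, running)
        else:
            timeline.append((time, running))
    return timeline


if __name__ == "__main__":
    # Hours since midnight of day one: two containers start at 21 (9 pm)
    # and stop at 38 (2 pm the next day); one container runs throughout.
    print(running_over_time([(0, None), (21, 38), (21, 38)]))
```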



With the above visualizations, you can get a detailed overview of the container usage in your Kubernetes clusters and quickly identify any problems. If you also want more detailed insight into the performance of the services running inside your containers, have a look at my previous blog post.

Published at DZone with permission of Matthew Demyttenaere, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
