How to Monitor Your Kubernetes Cluster
In this blog post, the author will discuss how to monitor a Kubernetes cluster with CoScale.
In a previous blog post, I explained how to monitor Docker containers with CoScale. Once your Dockerized application grows, an orchestrator becomes almost a necessity for managing and scaling your infrastructure. A popular choice is Kubernetes, which was initially designed by Google and is now maintained by the Cloud Native Computing Foundation. Kubernetes provides a platform for automating the deployment, scaling, and operation of application containers across clusters of hosts. Monitoring a large application deployed on a Kubernetes cluster can be challenging. In this blog post, I will discuss how to monitor a Kubernetes cluster with CoScale.
Installation of the CoScale Agent
To start, you need to get the CoScale agent running in your Kubernetes cluster. The easiest way to do this is to install it directly on the Kubernetes master node. If you cannot install an agent on your master node, you can install it via a DaemonSet instead. Once the CoScale agent is installed on your Kubernetes cluster, enabling the Kubernetes plugin will automatically collect detailed Kubernetes metrics and events, as well as additional metadata from the Kubernetes API. If you want to gather more detailed metrics about the containers themselves and the services that run inside them, we recommend running the CoScale agent with the Docker plugin on your minions (the worker nodes). Check my previous blog post for more details.
Once you have enabled the Kubernetes plugin on the master node, a number of default dashboards are created with the most important metrics from Kubernetes. The first dashboard gives you a high-level overview of the cluster itself. Here you can see how many containers are running on the different minions, as well as the total number of containers in your cluster. We also show the number of replication controllers and services. The event timeline gives you a clear view of when containers were created and killed in your cluster.
A second dashboard shows the cluster from a different, more detailed perspective. It lets you see on which minions your containers are (or were) running. A solid line connects a currently running container to its minion node; a dotted line indicates a container that was running during the selected time window but had stopped by the end of it. You can filter on replication controller and on service to focus on one specific application in your Kubernetes cluster. You can also select a container metric and color-code the containers based on it. In the example below, a container turns red when its CPU usage is above 90% and orange when it is between 75% and 90%. This allows you to easily pinpoint the containers that might be causing performance issues in your application.
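The color-coding rule described above can be sketched in a few lines of Python. This is only an illustration of the thresholds mentioned in the text, not CoScale's implementation: the function name, the sample container names, the "green" default, and the choice to treat exactly 75% and 90% as orange are all assumptions.

```python
def color_for_cpu(cpu_percent: float) -> str:
    """Map a container's CPU usage (in percent) to a display color."""
    if cpu_percent > 90:
        return "red"      # above 90%: likely performance problem
    if cpu_percent >= 75:
        return "orange"   # 75% to 90% (boundaries inclusive by assumption)
    return "green"        # assumption: the article does not name a default color


# Hypothetical containers and their current CPU usage.
cpu_by_container = {"web-1": 95.0, "web-2": 80.0, "db-1": 40.0}
colors = {name: color_for_cpu(cpu) for name, cpu in cpu_by_container.items()}
```

Filtering `colors` for "red" entries would then give the list of containers worth investigating first.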
Replication Controller and Microservice Visualizations
Another interesting way to view the containers in your cluster is to group them by replication controller or by service. This view shows the current replicas at a glance and helps you understand the container utilization of a particular microservice.
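To make the grouping idea concrete, here is a minimal Python sketch that groups containers by replication controller and summarizes the replica count and average CPU usage per group. The sample data, field layout, and function name are hypothetical; a real setup would pull this information from the monitoring agent or the Kubernetes API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample data: (container name, replication controller, CPU %).
containers = [
    ("frontend-a1", "frontend", 62.0),
    ("frontend-b2", "frontend", 88.0),
    ("api-c3", "api", 35.0),
]


def group_by_controller(rows):
    """Summarize replica count and average CPU usage per replication controller."""
    cpu_by_controller = defaultdict(list)
    for _name, controller, cpu in rows:
        cpu_by_controller[controller].append(cpu)
    return {
        controller: {"replicas": len(cpus), "avg_cpu": mean(cpus)}
        for controller, cpus in cpu_by_controller.items()
    }


summary = group_by_controller(containers)
```

The same function works for grouping by service if the second field holds a service name instead.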
Replication Controller Errors
You also want to know whether any of your replication controllers are missing replicas. These kinds of errors appear in the replication controller overview widget shown below. When you hover over the errors, you get further details on why some of the containers in your application are dying.
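The underlying check is simple: compare the number of replicas each replication controller should have with the number it actually has. The sketch below shows that comparison on hypothetical data. The `desired` and `current` keys are illustrative names; in the Kubernetes API the corresponding values live in a replication controller's `spec.replicas` and `status.replicas` fields.

```python
def missing_replicas(controllers):
    """Return {name: shortfall} for controllers running fewer replicas than desired."""
    shortfalls = {}
    for rc in controllers:
        shortfall = rc["desired"] - rc["current"]
        if shortfall > 0:
            shortfalls[rc["name"]] = shortfall
    return shortfalls


# Hypothetical replication controllers.
controllers = [
    {"name": "frontend", "desired": 3, "current": 3},
    {"name": "api", "desired": 4, "current": 2},
]

print(missing_replicas(controllers))  # prints {'api': 2}
```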
While the above visualization gives you a high-level overview of possible problems with your replication controllers, sometimes you also want to take a deeper dive into a specific replication controller or service. For that reason, CoScale also shows you the container lifecycles per replication controller or service. This gives you detailed insight into the number of running containers, failed containers, container start and stop times, and more.
In the example below, you can see that a batch of new containers was started at 9 pm and killed the following afternoon. These kinds of insights allow you to troubleshoot your Kubernetes cluster and see where you need to add containers to handle the load.
With the above visualizations, you can get a detailed overview of the container usage in your Kubernetes clusters and quickly identify any problems. If you also want more detailed insight into the performance of the services running inside your containers, have a look at my previous blog post.
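A lifecycle view like this boils down to counting, for any moment in time, the containers whose start and stop times enclose that moment. Here is a minimal Python sketch of that idea; the container names, dates, and times are made up for illustration and only mirror the pattern described above (containers started at 9 pm and stopped the next afternoon).

```python
from datetime import datetime

# Hypothetical lifecycle records: (container, started, stopped).
# A stopped value of None means the container is still running.
lifecycles = [
    ("worker-1", datetime(2016, 9, 1, 8, 0), None),
    ("worker-2", datetime(2016, 9, 1, 21, 0), datetime(2016, 9, 2, 15, 0)),
    ("worker-3", datetime(2016, 9, 1, 21, 0), datetime(2016, 9, 2, 15, 30)),
]


def running_at(records, moment):
    """Count containers whose lifetime covers the given moment."""
    return sum(
        1
        for _name, started, stopped in records
        if started <= moment and (stopped is None or moment < stopped)
    )


before_scale_up = running_at(lifecycles, datetime(2016, 9, 1, 12, 0))
after_scale_up = running_at(lifecycles, datetime(2016, 9, 1, 22, 0))
after_scale_down = running_at(lifecycles, datetime(2016, 9, 2, 16, 0))
```

Evaluating `running_at` at regular intervals yields the running-container curve; comparing it with the load at the same timestamps shows where extra containers were needed.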
Published at DZone with permission of Matthew Demyttenaere, DZone MVB. See the original article here.