Targeting Kubernetes Cluster With Gremlin Chaos Test
Gremlin is a software company focused on chaos testing. Its tool is similar to Netflix's Chaos Monkey but is more customizable, letting you test a system with random loads or scheduled shutdowns. In this article, we test a simple Kubernetes cluster running on EKS with a chaos test.
Why Is Chaos Testing Important?
Chaos Engineering is used to improve system resilience. Gremlin’s “Failure as a Service” helps to find weaknesses in the system before problems occur.
To try Chaos Engineering with Gremlin, we have two requirements: a running EKS cluster and two applications deployed to EKS. This tutorial helps you set up both requirements and then create a scenario to "simulate an attack with Gremlin".
- Step 1 - Prepare Cloud9 IDE
- Step 2 - Create an EKS cluster using eksctl
- Step 3 - Deploy Kubernetes Dashboard
- Step 4 - Install Gremlin using Helm
- Step 5 - Deploy a Microservice Demo Application
- Step 6 - Run a Shutdown Container Attack using Gremlin
Prerequisites
1. An AWS account
2. A Gremlin account, which you can register for on the Gremlin website
Step 1 - Prepare Cloud9 IDE
First, create the Cloud9 environment. Log in to your AWS account and navigate to the Cloud9 service page. Click Get Started and enter any name; in this example we chose chaous gremlin. Keep all the default settings, since the environment is only needed to reach EKS resources.
Wait for the new environment to build, then close all terminal tabs and open a new one.
Before creating the cluster, check whether the AWS CLI is installed.
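A minimal sketch of that check, assuming the CLI binary is called `aws` and lives on the `PATH` (the variable name `aws_cli_status` is only illustrative):

```shell
# Check whether the AWS CLI is on the PATH before going any further.
if command -v aws >/dev/null 2>&1; then
  aws_cli_status="installed"
  aws --version   # prints the installed CLI version
else
  aws_cli_status="missing"
  echo "AWS CLI not found; install it before continuing." >&2
fi
echo "AWS CLI: ${aws_cli_status}"
```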
Step 2 - Creating an EKS cluster using eksctl
We will use eksctl to create our EKS clusters.
Confirm that the eksctl command works.
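One way to confirm this, sketched with a small helper (`check_tool` is a hypothetical name, not part of eksctl), is to look for the binary first and print its version only if it is present:

```shell
# Report whether a command-line tool is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Print the eksctl version only when the binary exists.
if check_tool eksctl; then
  eksctl version
fi
```

The same helper can be reused later for `kubectl` and `helm`.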
Next, create a cluster named gremlin-eksctl with three EC2 nodes. A word of warning: EKS can be costly, so do not forget to delete your resources once you are done with your failure test.
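The cluster creation could be sketched as follows. The region is an assumption (the article does not state one), and the command is wrapped in a function so that nothing is created until you call it:

```shell
CLUSTER_NAME="gremlin-eksctl"          # cluster name used in this tutorial
NODE_COUNT=3                           # three EC2 nodes
AWS_REGION="${AWS_REGION:-us-west-2}"  # assumption: replace with your own region

# Create the EKS cluster; this is the step that takes a while and costs money.
create_cluster() {
  eksctl create cluster \
    --name "$CLUSTER_NAME" \
    --nodes "$NODE_COUNT" \
    --region "$AWS_REGION"
}
```

Run `create_cluster` when you are ready to start provisioning.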
The cluster might take around 15-30 minutes to become ready; you can follow its progress on the EKS service page.
Quick Tip - EKS costs $0.20/h, and each m5.large EC2 instance that EKS runs on costs $0.096/h. With three nodes, the estimated total cost is around $11-12 per day.
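The estimate can be reproduced with a few lines of shell arithmetic. The prices are the ones quoted above; the three-node count comes from the cluster definition, and the variable names are illustrative:

```shell
EKS_HOURLY=0.20    # EKS control plane price per hour, as quoted above
EC2_HOURLY=0.096   # m5.large price per hour, as quoted above
NODE_COUNT=3       # nodes in the gremlin-eksctl cluster
HOURS=24           # one day

# (control plane + all nodes) * hours, rounded to cents
daily_cost=$(LC_ALL=C awk -v eks="$EKS_HOURLY" -v ec2="$EC2_HOURLY" \
  -v n="$NODE_COUNT" -v h="$HOURS" \
  'BEGIN { printf "%.2f", (eks + ec2 * n) * h }')

echo "Estimated daily cost: \$${daily_cost}"
```

With these inputs the result is 11.71, which matches the $11-12 range.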
Check that the cluster is working and get its status. As expected, only one cluster has been created.
Update the kubeconfig file by giving the cluster name and region via the AWS CLI tool.
On checking, we see three nodes in the cluster. All three are worker nodes; the EKS control plane is managed by AWS and does not appear in the node list.
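The three checks above might look like this in practice. The region is again an assumption, and the function name `verify_cluster` is hypothetical:

```shell
CLUSTER_NAME="gremlin-eksctl"
AWS_REGION="${AWS_REGION:-us-west-2}"  # assumption: use the region you created the cluster in

verify_cluster() {
  # List the EKS clusters in the region; only gremlin-eksctl is expected.
  eksctl get cluster --region "$AWS_REGION"
  # Write the cluster's connection details into the local kubeconfig file.
  aws eks update-kubeconfig --name "$CLUSTER_NAME" --region "$AWS_REGION"
  # List the nodes that have joined the cluster.
  kubectl get nodes
}
```

Call `verify_cluster` once the cluster shows as active on the EKS service page.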
Step 3 - Deploying Kubernetes Dashboard
Next, we deploy the Kubernetes Dashboard to the cluster, together with Heapster and InfluxDB. These two tools help our sample application show up in the dashboard. We start by deploying the Kubernetes Dashboard itself.
Heapster is a performance monitoring and metrics collection system compatible with Kubernetes (versions 1.0.6 and above). It collects not only performance metrics about your workloads, pods, and containers, but also events and other signals generated by your cluster. Heapster is fully open source as part of the Kubernetes project and supports a multitude of backends for persisting the data, including but not limited to InfluxDB, Elasticsearch, and Graphite.
InfluxDB is a time series database designed to handle high write and query loads.
- Deploying the Kubernetes Dashboard to your EKS cluster:
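As a sketch, the Dashboard can be deployed by applying its manifest with `kubectl`. The manifest version below (v1.10.1) is an assumption, since the article does not name one; check the Dashboard project's releases for the version that suits your cluster:

```shell
# Assumption: the v1.10.1 "recommended" manifest of the Kubernetes Dashboard.
DASHBOARD_MANIFEST="https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml"

# Apply the Dashboard manifest to the current cluster.
deploy_dashboard() {
  kubectl apply -f "$DASHBOARD_MANIFEST"
}
```

Run `deploy_dashboard` after your kubeconfig points at the gremlin-eksctl cluster.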
Creating Heapster cluster role binding for the Dashboard.
The next step is to create an eks-admin service account, which you will use to connect to the Kubernetes Dashboard.
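A minimal sketch of such a service account follows. It assumes the account lives in the `kube-system` namespace and is bound to the built-in `cluster-admin` role; the file name is illustrative:

```shell
# Write a manifest with the service account and its cluster role binding.
cat > eks-admin-service-account.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eks-admin
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eks-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: eks-admin
  namespace: kube-system
EOF

# Apply the manifest to the cluster.
create_eks_admin() {
  kubectl apply -f eks-admin-service-account.yaml
}
```

Binding to `cluster-admin` grants full access, which is convenient for a short-lived test cluster but too broad for anything long-running.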
To authenticate and use the Kubernetes Dashboard:
To access the Kubernetes Dashboard:
- In your Cloud9 environment, click Tools > Preview > Preview Running Application to open the Dashboard URL.
- Append the following to the end of the URL:
Select Token, then copy the authentication token of the eks-admin service account and paste it into the text field.
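The authentication and access steps above could be sketched as follows. Several things here are assumptions: that the Dashboard runs in `kube-system` under the service name `kubernetes-dashboard` (as in the 1.x manifests), that the cluster auto-creates a token secret for the service account (older Kubernetes versions do), and that port 8080 is used for the Cloud9 preview:

```shell
DASHBOARD_PORT=8080   # Cloud9 can preview applications on this port
# Path to append to the preview URL (assumes a Dashboard 1.x deployment in kube-system).
DASHBOARD_PATH="/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/"

# Print the details of the eks-admin token secret, including the token itself.
get_dashboard_token() {
  secret_name=$(kubectl -n kube-system get secret | grep eks-admin | awk '{print $1}')
  kubectl -n kube-system describe secret "$secret_name"
}

# Start a proxy to the Kubernetes API in the background.
start_dashboard_proxy() {
  kubectl proxy --port="$DASHBOARD_PORT" --address='0.0.0.0' --disable-filter=true &
}
```

Start the proxy first, open the preview, append `$DASHBOARD_PATH` to the URL, and then use the output of `get_dashboard_token` on the login screen.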
Step 4 - Installing Gremlin using Helm
Download your Gremlin certificates:
Start by signing in to your Gremlin account, creating one first if you don't have it. Navigate to Team Settings and click on your Team. Click the Download button to download the certificates and save them to your local drive. Note that the downloaded certificate.zip contains both a public-key certificate and a matching private key.
Unzip certificate.zip and save its contents to a Gremlin folder on your desktop. Rename the certificate file to gremlin.cert and the key file to gremlin.key.
Creating Gremlin Namespace:
To create a Kubernetes Secret for your certificate and private key, first copy gremlin.cert and gremlin.key to Cloud9. A quick tip: create these files in Cloud9 with the Vim editor and paste in their contents, instead of copying the files from your local computer.
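A sketch of the namespace and secret creation, assuming the namespace is called `gremlin`, the secret is called `gremlin-team-cert`, and both files sit in the current directory. Verify the secret name against the Gremlin documentation for your agent version:

```shell
GREMLIN_NAMESPACE="gremlin"               # assumption: namespace name
GREMLIN_SECRET_NAME="gremlin-team-cert"   # assumption: secret name expected by the Gremlin agent

create_gremlin_secret() {
  # Create the namespace that will hold the Gremlin resources.
  kubectl create namespace "$GREMLIN_NAMESPACE"
  # Store the certificate and private key as one generic secret.
  kubectl create secret generic "$GREMLIN_SECRET_NAME" \
    --namespace "$GREMLIN_NAMESPACE" \
    --from-file=./gremlin.cert \
    --from-file=./gremlin.key
}
```

Run `create_gremlin_secret` from the directory that contains gremlin.cert and gremlin.key.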
Check on the Dashboard whether the secret has been deployed.
Installation With Helm
The simplest way of installing the Gremlin client on your Kubernetes cluster is to use Helm. Once Helm is installed and configured, the next steps will be to add the Gremlin repo and to install the client.
Downloading the Helm install script and making it executable:
Configuring Helm to access with RBAC
Helm relies on a service called Tiller, which requires special permissions on the Kubernetes cluster, so we need to create a Service Account for Tiller to use. The next step is then to apply this RBAC configuration to the cluster.
Creating a new service account:
Installing Tiller for Helm:
Tiller is a companion server component that runs on your Kubernetes cluster, listens for commands from helm, and handles the configuration and deployment of software releases on the cluster.
This installs Tiller into the cluster and gives it access to manage resources in your cluster. Helm prints a security policy alert during installation, which you can ignore or follow according to your policy settings.
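The service account and Tiller installation described above might be sketched like this. It assumes Helm 2, since Tiller and `helm init` were removed in Helm 3, and it binds Tiller to `cluster-admin` for simplicity; the file name `rbac.yaml` is illustrative:

```shell
# Write the RBAC manifest for the Tiller service account.
cat > rbac.yaml <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
EOF

install_tiller() {
  # Create the service account and its role binding.
  kubectl apply -f rbac.yaml
  # Install Tiller into the cluster under that service account (Helm 2 only).
  helm init --service-account tiller
}
```

With Helm 3 this whole step is unnecessary, because the client talks to the Kubernetes API directly.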
Activating bash-completion for Helm:
To run the Helm install, you will need your Gremlin Team ID. It can be found in the Gremlin app on the Team Settings page, where you downloaded your certificates earlier. Click on your Team in the list. The ID you’re looking for can be found under Configuration as Team ID.
Export your Team ID as an environment variable.
Next, export your cluster ID by choosing a name for your Kubernetes cluster.
Now add the Gremlin Helm repo, and install Gremlin:
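Put together, the export and install steps could look like the sketch below. The placeholder values, the Helm 2 `--name` flag, and the chart value names `gremlin.teamID` and `gremlin.clusterID` are assumptions; check them against the Gremlin chart version you install:

```shell
# Placeholders: replace with your Team ID and a cluster name of your choice.
export GREMLIN_TEAM_ID="${GREMLIN_TEAM_ID:-your-team-id}"
export GREMLIN_CLUSTER_ID="${GREMLIN_CLUSTER_ID:-my-eks-cluster}"

install_gremlin() {
  # Register the Gremlin chart repository.
  helm repo add gremlin https://helm.gremlin.com
  # Install the Gremlin client into the gremlin namespace (Helm 2 syntax).
  helm install gremlin/gremlin \
    --namespace gremlin \
    --name gremlin \
    --set gremlin.teamID="$GREMLIN_TEAM_ID" \
    --set gremlin.clusterID="$GREMLIN_CLUSTER_ID"
}
```

On Helm 3 the release name is a positional argument (`helm install gremlin gremlin/gremlin ...`) rather than `--name`.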
Step 5 - Deploying a Microservice Demo Application
The demo environment we are going to deploy onto our EKS cluster is the Hipster Shop: Cloud-Native Microservices Demo Application.
Clone repo of app source code:
Change directory to the one just created:
Deploying the application:
Wait until pods are in a ready state.
Getting the frontend IP address:
Visit the URL in your browser:
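The deployment steps above can be sketched as one function. The repository URL, manifest path, and the `frontend-external` service name are taken from the public microservices-demo project and should be treated as assumptions if the project layout has changed:

```shell
DEMO_REPO="https://github.com/GoogleCloudPlatform/microservices-demo.git"

deploy_demo() {
  # Clone the application source and move into the new directory.
  git clone "$DEMO_REPO"
  cd microservices-demo || return 1
  # Deploy all services of the demo application.
  kubectl apply -f ./release/kubernetes-manifests.yaml
  # Block until every pod reports Ready, or give up after five minutes.
  kubectl wait --for=condition=Ready pod --all --timeout=300s
  # Show the external address of the frontend load balancer.
  kubectl get service frontend-external
}
```

The EXTERNAL-IP column of the last command holds the address to open in the browser; it can take a few minutes for the load balancer to become reachable.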
Step 6 - Running a Shutdown Container Attack using Gremlin
We are going to create our first Chaos Engineering experiment, in which we validate EKS reliability. Our hypothesis is, “After shutting down my cart service container, we will not suffer from downtime and EKS will give us a new one.”
Going back to the Gremlin UI, select Attacks from the menu on the left and select New Attack. We’re going to target a Kubernetes resource, so click on Kubernetes on the upper right.
Choose State and Shutdown:
Attacking this pod with our Gremlin UI:
We will be shutting down the cartservice containers. As a test, we attacked twice, and each time the cartservice pod restarted itself, which shows that it is working as expected. Note that the pod comes back on its own even when the attack shuts its containers down.
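To observe the restarts from the terminal while the attack runs, something like the following could be used. It assumes the cart service pods carry the label `app=cartservice`, as in the demo manifests:

```shell
CART_LABEL="app=cartservice"   # assumption: label used by the demo manifests

# Stream status changes of the cart service pods; stop with Ctrl+C.
watch_cartservice() {
  kubectl get pods -l "$CART_LABEL" --watch
}
```

During the attack, the RESTARTS column should increase while the pod returns to the Running state.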
When we attacked our containers, Kubernetes restarted them, which shows that our system tolerates this kind of failure. We have seen what happens when a failure occurs; in this example, the failure is a container shutdown. As a result, we understand that our cluster already has a self-healing capability: containers that are shut down get restarted automatically.
As a reminder, do not forget to delete your cluster and Cloud9 IDE.
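A sketch of the cluster cleanup, assuming the same cluster name as before and an assumed region. The Cloud9 environment is deleted separately from the Cloud9 service page:

```shell
CLUSTER_NAME="gremlin-eksctl"
AWS_REGION="${AWS_REGION:-us-west-2}"  # assumption: the region the cluster was created in

# Delete the EKS cluster and the resources eksctl created for it.
delete_cluster() {
  eksctl delete cluster --name "$CLUSTER_NAME" --region "$AWS_REGION"
}
```

Run `delete_cluster` as soon as you have finished the experiment to stop the hourly charges.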
Congrats! You’ve installed an AWS EKS cluster, deployed the Kubernetes Dashboard, deployed a microservice demo application, installed the Gremlin agent as a daemon-set, and ran your first Chaos Engineering attack to validate Kubernetes reliability!
Published at DZone with permission of Sudip Sengupta. See the original article here.