Canary Deployment of Applications on Kubernetes Using Spinnaker
A powerful deployment strategy using Spinnaker.
In the last few years, many Continuous Delivery tools have been launched in the market. Today, most companies are looking for a deployment cycle that takes very little time while remaining safe and disciplined. Customers need to identify the operational needs of their application deployment process and evaluate the appropriate CD tool to achieve the best deployment strategy.
CI/CD Without Spinnaker (Problem Statement)
CI tools such as Jenkins provide extensive support for Continuous Integration, but when it comes to Continuous Delivery, especially where frequent canary deployments happen on a large-scale cloud platform, they always depend on third-party tools like Ansible or Puppet, which require complex pipeline scripts with lots of customization. Many companies struggle to achieve a flawless canary deployment strategy due to a lack of automation skills.
Spinnaker is an open-source, multi-cloud CD tool that provides phenomenal support in very large-scale cloud deployment environments. Spinnaker can be used in both Continuous Delivery and Continuous Deployment platforms. It simplifies automation tasks and has native support for deploying applications on Kubernetes clusters. It supports multiple cloud platforms such as AWS, GCP, Azure, Oracle Cloud Infrastructure, Cloud Foundry, OpenStack, and Kubernetes, along with many more attractive features.
Why Spinnaker is Different
- The best part of Spinnaker is that it is open source, with many attractive features such as CLI setup, CI integrations, and pipeline management.
- Spinnaker deployments are smooth and very quick compared to other CD tools. It provides easy support for rolling deployments in a large-scale environment. Spinnaker has built-in Blue/Green and Canary deployment strategies, which do not exist in tools like Puppet, Chef, Ansible, Salt, or CFEngine.
- Spinnaker is aimed mainly at cloud-native workloads; it was designed to support multi-cloud architectures (to avoid vendor lock-in), which gives it a slight advantage over other CD tools. It was originally developed by Netflix.
Spinnaker supports two deployment strategies out of the box: red/black (also known as blue/green) and canary. These strategies are built into the Spinnaker application. Let us deep-dive into canary deployment using Spinnaker.
Canary Deployment Strategy
What is Canary Deployment?
Canary deployment is a pattern of rolling out releases to a subset of users or servers. The idea is to first deploy the change to a small subset of servers, test it, and then roll the change out to the rest. Once all health checks pass and there are no complaints, all customers are routed to the new version of the application and the old version is deleted.
There are four images shown above. Consider two application versions (v1.0 and v1.1). Image 1 shows 100% of traffic routed to the older application version (v1.0), which was already deployed in production. Image 2 shows 90% of traffic routed to the older version (v1.0) and 10% routed to the new version (v1.1). Image 3 shows traffic split equally between the old and new versions. Image 4 shows 100% of traffic routed to the new version (v1.1), with no traffic at all to the older version (v1.0). This is canary deployment.
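When canary pods sit behind a plain Kubernetes Service with no mesh or external load balancer, traffic is split roughly round-robin across pods, so the canary's traffic share is approximately its share of the total replicas. The helper below is a hypothetical sketch (not part of Spinnaker) of turning a desired traffic percentage into a replica count:

```python
def canary_replicas(total_replicas: int, traffic_percent: float) -> int:
    """Replica count that gives the canary roughly `traffic_percent`
    of round-robin traffic out of `total_replicas` pods."""
    return max(1, round(total_replicas * traffic_percent / 100))

# With 10 pods total: a 10% canary needs 1 pod, a 50% canary needs 5.
print(canary_replicas(10, 10))  # 1
print(canary_replicas(10, 50))  # 5
```

A service mesh such as Istio can instead split traffic by exact percentages, independent of replica counts.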
Canary analysis compares the old (v1.0) and new (v1.1) versions of your application. The differences can be subtle and might take some time to appear, and you might have many different metrics to examine. The canary analysis is performed against the application metrics that are configured in the tool.
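Conceptually, the analysis collects the same metric from the baseline and canary pods and rolls the per-metric verdicts into an overall score. The snippet below is a deliberately simplified, hypothetical sketch of that idea; Spinnaker's actual judge (Kayenta) uses statistical tests rather than a plain mean comparison:

```python
from statistics import mean

def compare_metric(baseline: list, canary: list, tolerance: float = 0.10) -> bool:
    """Pass if the canary's mean is within `tolerance` (10%) of the baseline's."""
    b, c = mean(baseline), mean(canary)
    return abs(c - b) <= tolerance * b

def canary_score(results: dict) -> float:
    """Overall score: percentage of metrics that passed."""
    return 100 * sum(results.values()) / len(results)

results = {
    "error_rate": compare_metric([0.01, 0.02, 0.01], [0.01, 0.01, 0.02]),
    "latency_ms": compare_metric([120, 130, 125], [250, 260, 255]),  # regression
}
print(canary_score(results))  # 50.0
```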
Spinnaker Workflow Overview
The workflow diagram below explains the CI/CD deployment model on Kubernetes using Spinnaker.
As displayed in the above image, the developer changes the code and pushes it to a Git repository. A Jenkins build detects the changes, builds the Docker image, tests the image, and pushes it to a Docker registry or any other private repository. Spinnaker detects the image via an automated trigger, initiates the canary deployment, and performs functional tests of the canary. After a manual approval, or once the canary analysis succeeds, Spinnaker deploys the image to production by disabling the old application version.
Spinnaker is a pure Continuous Deployment tool, so we should avoid comparing it with Jenkins or any other CI tool. The Jenkins and Spinnaker combination works wonders for your deployment process on a multi-cloud, large-scale deployment platform. Spinnaker pipelines can also be triggered from Jenkins through Spinnaker API calls.
Spinnaker receives the binary from a Docker registry or any other private registry such as Nexus or JFrog Artifactory; you need to integrate the respective registry with Spinnaker. These pipelines can also be triggered automatically by polling for changes in one of these registries, or whenever a change happens to the manifest file in GitHub.
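For illustration, an automated Docker trigger in a Spinnaker pipeline's JSON looks roughly like the fragment below (the account, organization, and repository names are hypothetical placeholders):

```json
{
  "triggers": [
    {
      "type": "docker",
      "enabled": true,
      "account": "my-docker-registry",
      "organization": "myorg",
      "repository": "myorg/helloworld",
      "tag": "v.*"
    }
  ]
}
```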
Spinnaker is an extremely powerful open-source platform for managing software deployments. It supports many different platforms and deployment methodologies, for the purpose of this blog we are going to run it on top of a managed Kubernetes cluster in a VM. We will use Prometheus for collecting metrics.
For clarification, as we are dealing with relatively advanced concepts, I won't really go into detail about what Kubernetes and containers are, or how to set up Spinnaker and its components. There is plenty of information available online about these technologies. I'll also mention that this is a pure how-to article, with the aim of giving you a platform to test out canary deployments on Spinnaker.
Canary Deployment Prerequisites
- Kubernetes cluster environment (one for staging and one for production)
- Spinnaker with application setup and a pipeline script (manifest)
- Prometheus for metrics collection (you can also use Istio for the same feature)
- Storage service (S3, GCS, or Minio) to store the configuration data
- Application images to perform the canary deployment (from Docker Hub or some other registry)
Setting up Canary in Spinnaker
There are a few Halyard commands to enable canary analysis in Spinnaker.
Please refer to this link to set up canary analysis.
Before you can use the canary analysis service, you must configure at least one metrics service and at least one storage service. The most common setup is one metrics service (e.g., Stackdriver, Atlas, Prometheus, Datadog, or New Relic) and one storage service (e.g., S3, GCS, or Minio). For this exercise, let us use Minio as the storage provider and Prometheus as the metrics provider.
Installing Prometheus & Enabling Canary Analysis
Let's install Prometheus on the Kubernetes master server. We will use a Helm chart to install it.
Run the following command:
- helm install --name stable-prometheus stable/prometheus-operator
Once the above command completes, you can see the Prometheus services (Grafana, kube-state-metrics, node exporter) running on the server.
Setting Up Prometheus
- docker exec -it halyard /bin/bash (log in to the Halyard container)
- hal config canary enable (enable canary support)
- hal config canary edit --default-metrics-account prometheus (add the metrics account)
- hal config canary edit --default-storage-account minio (add the storage account)
- hal config canary prometheus enable (enable the Prometheus canary integration)
You can see the Prometheus main service is running as shown below,
bash-5.0$ kubectl get svc | grep prometheus
prometheus-operated ClusterIP None <none> 9090/TCP 28m
prometheus-prometheus-oper-prometheus ClusterIP 10.106.237.114 <none> 9090/TCP 30m
To access the Prometheus URL from a browser, run the below command to change the service type from ClusterIP to NodePort,
bash-5.0$ kubectl edit svc prometheus-prometheus-oper-prometheus
bash-5.0$ kubectl get svc | grep prometheus
prometheus-operated ClusterIP None <none> 9090/TCP 31m
prometheus-prometheus-oper-prometheus NodePort 10.106.237.114 <none> 9090:32734/TCP 32m
You can see a new port number (32734) was assigned automatically to access the Prometheus dashboard from the browser. You can also pin this port number within the NodePort range defined by Kubernetes (30000–32767 by default).
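As an illustration, pinning the port instead of letting Kubernetes pick one is a one-line change in the Service spec (the port values here are an assumed example):

```yaml
# Excerpt of the Service spec after `kubectl edit svc ...`
spec:
  type: NodePort
  ports:
    - port: 9090          # service port inside the cluster
      targetPort: 9090    # container port
      nodePort: 32734     # must fall in the NodePort range (30000-32767 by default)
```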
To add a canary account for Prometheus,
- hal config canary prometheus account add prometheus --base-url http://192.168.110.23:32734
- hal config canary prometheus account list (verify the account)
- hal deploy apply (deploy the changes made so far)
The Prometheus URL is http://192.168.110.23:32734
Prometheus is just a metrics provider. As displayed on the Prometheus homepage above, you can see the default metrics list; there are more than 400 metrics available within Prometheus. These metrics are used for performing canary analysis. Your QA team can define the required metrics based on your application's functions and requirements, and multiple metrics can be selected for a single application. Please explore this further on your own, as we are not going to cover it here.
Similarly, you can compare the different set of metrics available in Istio. Istio can also be used to route traffic with the help of a load balancer; many production environments use the Istio-Prometheus-Spinnaker combination.
For this demo, we are going to use Prometheus metrics for performing canary analysis.
Please go through the steps of the Spinnaker project/application creation process, pipeline creation, adding a new stage, and deployment configuration in Spinnaker. Let us see a demo (screenshots) of a simple "helloworld" application configuration and its canary deployment process.
Spinnaker Dashboard Homepage
Select the application named "helloworld"; it will take you to the below page.
You can see the clusters, load balancers, firewalls, and instance details where the actual deployment happens. These features are built into Spinnaker and can be customized according to your requirements and resources. By default, Kubernetes serves traffic in a round-robin manner if no other external load balancer is configured.
Click Delivery & then click Pipelines, it will take you to the below page.
As you can see in the above screenshot, there are two deployment pipelines (marked in blue), named as below:
- deploy_to_kubernetes (normal or Blue/Green deployment, manifest file based)
- Canarydeploy (Canary deployment)
Let us see these above pipeline scripts in detail.
1. deploy_to_kubernetes pipeline – Click this pipeline and then click the Configure button; it will take you to the below page. You can see a pipeline stage named "Deploy (Manifest)", which was created earlier (by clicking on the initial Configuration and then clicking the Add stage option) for the normal deployment. Clicking Add stage takes you to the below page.
Select the account details (created during the Kubernetes cluster setup) and the application name, and choose "Text", as we are going to do the deployment manually using this manifest script.
Below is the pipeline script pasted into the given box for deploying the helloworld application. This script contains the ConfigMap, Service, and Deployment resource details. Please explore the YAML file format further; it is needed to understand the resources and how to declare them. The manifest can also be placed in GitHub and triggered automatically via SCM polling whenever it changes.
Pipeline Script Content
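The original script is shown as a screenshot; the fragment below is a hypothetical reconstruction of its Service and Deployment pieces, using the details mentioned in this article (3 replicas, service name spinnaker-demo, NodePort 30010, version v1.0). The image name, ports, and labels are assumptions:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: spinnaker-demo
spec:
  type: NodePort
  selector:
    app: helloworld
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30010       # the port accessed from the browser later
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
  labels:
    app: helloworld
    version: v1.0
spec:
  replicas: 3               # the 3 pods seen under the CLUSTERS tab
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
        version: v1.0
    spec:
      containers:
        - name: helloworld
          image: myregistry/helloworld:v1.0   # hypothetical image name
          ports:
            - containerPort: 8080
```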
Save the changes and go to the PIPELINES page again,
Click “Start Manual Execution”, your new pipeline script will do the deployment based on the input parameters (deployment/services/ConfigMaps/Replicaset) given in the pipeline script.
While the script is running, you can see a long blue horizontal bar with the status "RUNNING"; this means the deployment is in progress and the instance is being prepared. Once the deployment is done, it changes as shown below,
You can see the color has changed from blue to green, which confirms that the deployment of your application has completed successfully.
Click Execution Details to see the status of your deployment as shown below,
As shown in the below screenshot, under the CLUSTERS tab, you can see the 3 replica pods running that were specified in the manifest file,
This completes your manifest-file-based application deployment process. You can see your application service (spinnaker-demo) running on these 3 replica pods (specified in the manifest pipeline script), as highlighted in the below screenshot,
Now you can access your application services (endpoints) from a browser with the port number (30010) specified in the pipeline script. The application will be running in three different Kubernetes pods, named with the suffixes (7kvpc, p8cch, and tk7r7) highlighted in the above image.
Below screenshot shows the service access page output from the browser,
Let us see how to deploy the same "helloworld" application using the canary strategy. We need an already running production deployment of the application to do canary analysis against; that is why we completed the normal deployment process above.
2. Canarydeploy (Canary deployment)
Click Pipelines & then select “Canarydeploy” pipeline script which was already created and available as shown below,
The "Canarydeploy" pipeline performs the canary deployment by running the stages below in the order shown in the image. The names of these stages are customizable:
- GetBaseline (tag the already running production image as the baseline)
- DeployCanary (deploy the new application version, named as the canary)
- Canary Analysis (perform canary analysis based on the metrics selected from Prometheus)
- Promote Canary (promote to production if the canary analysis meets the requirements by achieving its threshold value/health-check parameters)
- Delete (Manifest) (delete the old version of the application if the canary analysis is successful)
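In the pipeline's underlying JSON, this ordering is expressed through each stage's refId and requisiteStageRefIds fields. A trimmed, hypothetical sketch of the five stages (stage types are assumptions based on common Spinnaker stage names):

```json
{
  "stages": [
    { "refId": "1", "name": "GetBaseline",      "type": "findArtifactsFromResource", "requisiteStageRefIds": [] },
    { "refId": "2", "name": "DeployCanary",     "type": "deployManifest",            "requisiteStageRefIds": ["1"] },
    { "refId": "3", "name": "Canary Analysis",  "type": "kayentaCanary",             "requisiteStageRefIds": ["2"] },
    { "refId": "4", "name": "Promote Canary",   "type": "deployManifest",            "requisiteStageRefIds": ["3"] },
    { "refId": "5", "name": "Delete (Manifest)","type": "deleteManifest",            "requisiteStageRefIds": ["3"] }
  ]
}
```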
Let us see how these stages were created,
1. Click on the "GetBaseline" stage (the stage name can be defined on your own),
Select the "Find Artifacts from Resource (Manifest)" configuration,
In the above stage, we select the same input parameters as the already running production application version (v1.0) and tag it as the baseline (spinnaker-demo). No pipeline script is needed for this stage; it simply creates a copy of the existing production image with the new baseline tag.
2. DeployCanary (Deploy new Application version v2.0, named as canary)
Select GetBaseline & Add new stage called “DeployCanary” with below parameters,
Please set "depends on" to the previous stage, "GetBaseline".
In the manifest configuration, select Text and copy the below script, with a few changes (name and version) compared to the previously deployed production pipeline script.
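A hypothetical sketch of such a canary manifest: a distinct name, a v2.0 version label, and a single replica (image name and labels are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld-canary        # distinct name so it runs beside production
  labels:
    app: helloworld
    version: v2.0
spec:
  replicas: 1                    # single canary pod
  selector:
    matchLabels:
      app: helloworld
      version: v2.0
  template:
    metadata:
      labels:
        app: helloworld
        version: v2.0
    spec:
      containers:
        - name: helloworld
          image: myregistry/helloworld:v2.0   # hypothetical image name
          ports:
            - containerPort: 8080
```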
On execution of this pipeline, you can see your newly created application image version (v2.0) running on a single replica pod, as specified in the pipeline script. Try to access the service from a browser; we will not repeat that here. Now it is time to compare this newly deployed version with the old deployed version through canary analysis.
3. Canary Analysis
For Canary Analysis, first we need to add the metrics from Prometheus.
Click on "CANARY CONFIGS" and then select "Add configuration"; you will get a new page to enter the metric details, shown below.
Edit the newly created metric config "canary-new-config", and you will get the below page.
Provide the details, and under Metric Name select the appropriate metric for the canary analysis of your application. For this demo, we have selected "kube_deployment_labels". The analysis compares the already deployed (baseline) application version (v1.0) with the newly deployed canary version (v2.0); it performs health/liveness checks, network functionality checks, and a few more basic checks, and produces reports and graphs in the UI dashboard. If the canary analysis passes, the pipeline promotes to the next stage.
To add the "canaryanalysis" stage,
Click on DeployCanary and add a new stage of type "Canary Analysis".
Provide the required parameters for your application to perform canary analysis as shown in the below image,
Here you need to select the metric config file (canary-new-config) that you created during metrics configuration.
Lifetime is the duration over which the canary analysis runs. I have provided 2 minutes since this is for demo purposes; usually the lifetime will be higher, based on the testing duration of your application.
You need to select the baseline and canary names as shown in the above image. This is very important: the canary analysis compares the canary application version (v2.0) against the baseline application version (v1.0). These names (baseline and canary) must also match what is used in the respective pipeline scripts.
In the above screenshot, you can specify the threshold values (for example, I provided 50% and 75%).
If the canary analysis score meets the specified thresholds, the pipeline promotes to the next stage. These values can be adjusted according to your application's testing needs.
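Spinnaker's canary judge works with two thresholds, commonly called marginal and pass. Below is a hypothetical sketch of that decision using the 50%/75% values from this demo (the exact semantics of Spinnaker's judge may differ, e.g. for early aborts in real-time analysis):

```python
def judge(score: float, marginal: float = 50.0, pass_threshold: float = 75.0) -> str:
    """Two-threshold canary judgment: fail below marginal,
    succeed at or above pass, otherwise marginal."""
    if score < marginal:
        return "fail"
    if score >= pass_threshold:
        return "success"
    return "marginal"

print(judge(40))   # fail
print(judge(60))   # marginal
print(judge(90))   # success
```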
4. Promote Canary (Promoting to production if canary analysis meets the requirements by achieving its threshold value/health check parameters)
To create this stage,
Click on Canary Analysis and add a new stage named "Promote Canary"; it should depend on the Canary Analysis stage, as shown in the below image,
In the above text box, copy and paste the below script content,
This script deploys the same application version (v2.0) as the DeployCanary stage. It is promoted to production if the canary analysis is successful.
Now your new application version (v2.0) is deployed in the production environment based on the configuration given in the manifest file. Once the promotion is successful, the pipeline moves to the next stage and deletes the previously deployed application version (v1.0).
5. Delete (Manifest) – delete the old version of the application if the canary analysis is successful.
Select Canary Analysis and create a new stage of type "Delete (Manifest)".
Once all the stages are created as described above, you can start the canary deployment pipeline manually.
Go to Pipelines and click "Start Manual Execution" on the Canarydeploy pipeline; each stage executes once the stage it depends on completes.
Below are screenshots of the different stages during the execution of the Canarydeploy pipeline,
With these stages configured, we have completed the canary deployment on Kubernetes using Spinnaker. When you access your application service from a browser, the new version of the application will be running; you can verify this from the browser.
CANARY REPORTS page will show the overall information about canary deployment as shown below,
The above dashboard also shows the canary analysis report as graphs and in other formats. This page shows detailed information about the canary deployment results.
Important Features of Spinnaker
- Safe deployments
- Features in pipeline creation (Including Automated Triggers)
- Rollback (including Automatic Roll back if deployment fails)
- Manual Judgement (can be also added as a stage in Canary deployment)
The link below walks through the above topics,
- Spinnaker is a heavyweight application, as it is designed to support large-scale deployment environments. It is composed of microservices, which can make deploying and managing the platform difficult, and it requires a lot of compute power to run. If your company is mature in its infrastructure capabilities, then this tool works wonders, but it might not be right for smaller groups.
- Avoid overusing the ad-hoc "edit" features. Spinnaker provides quick ways to edit your deployed manifests in the infrastructure screen, but these are meant as a quick fallback when mitigating a broken rollout, or to increase the number of pods serving traffic.
- Setting up Spinnaker is a very challenging task. You need to know Halyard well and understand how it connects with the other Spinnaker components, especially if you have a tricky, specific setup. Windows-based teams may find it difficult to onboard to Spinnaker.
Links for Reference
Docs reference: https://www.spinnaker.io/guides/tutorials
Official Slack channel: https://join.spinnaker.io/
Community page: https://blog.spinnaker.io/
Stack Overflow support: https://stackoverflow.com/questions/tagged/spinnaker
Configuring deployment scenarios in production on Kubernetes:
Canary setup: https://www.spinnaker.io/guides/user/canary/stage/
Blue/Green deployment: https://spinnaker.io/guides/user/kubernetes-v2/traffic-management/#route-traffic-during-a-deployment-bluegreen
Spinnaker provides extensive support for canary deployment. It provides an in-house bakery service, which helps with immutable deployments. Rollback and resizing of clusters is one of the coolest features of Spinnaker. It offers both high-level and low-level views of clusters, with fine-grained options to control cloud infrastructure from the Spinnaker UI itself. It is a strongly recommended open-source tool for doing blue/green or canary deployments on a multi-cloud platform in a large-scale environment.
Opinions expressed by DZone contributors are their own.