Canary Deployment, Constraints, and Benefits
Canary deployment is widely used because it lowers the risk of moving changes into production while reducing the need for additional infrastructure.
Canaries were an essential part of British mining history: these humble birds were used to detect carbon monoxide and other toxic gases before the gases could hurt humans (the canary is more sensitive to airborne toxins than humans are). The term “canary analysis” in software deployment serves a similar purpose. Just as the canary alerted miners to problems in the air they breathed, DevOps engineers use canary deployment analysis to gauge whether a new release in their CI/CD process will cause trouble for the business.
You can consider the following general definition of a canary release deployment: canary deployment is a technique to reduce the risk of introducing a software update in production by slowly rolling out the change to a small subset of users before making it available to everybody.
When and How Is Canary Deployment Useful?
For example, a product company that sells its products online cannot afford a single moment of application failure. That would mean a significant loss of revenue and a big increase in irate customers. At the same time, the company wants to continually update and enhance its software and services, providing a better customer experience. However, every change carries the risk of causing errors in production. How do you reduce the risk of error so you can update your application confidently? One approach is the canary deployment strategy.
In its simplest form, a canary deployment has three stages:
- Plan and Create: The first step involves creating a new canary infrastructure where the latest update is deployed. A small amount of traffic is sent to the canary instance, while most users continue to use the baseline instance.
- Analyze: Once some traffic is diverted to the canary instance, the team collects data: metrics, logs, information from network traffic monitors, results from synthetic transaction monitors – anything that helps determine whether the new canary instance is operating as it should. The team then analyzes this data, comparing it to the baseline version.
- Roll: After the canary analysis is completed, the team decides whether to move ahead with the release and roll it out for the rest of the users or roll back to the previous baseline state.
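The three stages above can be sketched in a few lines of Python. This is only an illustrative sketch, not a production router: the function names, the hash-based bucketing, the choice of error rate as the single metric, and the 1% tolerance are all assumptions made for the example. In practice the traffic split usually lives in a load balancer or service mesh, and the analysis covers many metrics.

```python
import hashlib


def routes_to_canary(user_id: str, canary_percent: float) -> bool:
    """Plan and Create: deterministically send roughly canary_percent% of users to the canary.

    Hashing the (hypothetical) user ID keeps each user on the same instance
    across requests instead of flipping between versions.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # stable bucket in 0..9999
    return bucket < canary_percent * 100


def error_rate(errors: int, requests: int) -> float:
    """Analyze: reduce raw counters to a metric that can be compared across instances."""
    return errors / requests if requests else 0.0


def decide(baseline_rate: float, canary_rate: float, tolerance: float = 0.01) -> str:
    """Roll: move ahead only if the canary is not noticeably worse than the baseline.

    The tolerance value is an assumed threshold, chosen only for illustration.
    """
    if canary_rate <= baseline_rate + tolerance:
        return "roll forward"
    return "roll back"
```

With these helpers, a call such as `decide(error_rate(20, 1000), error_rate(5, 50))` compares a baseline that failed 20 of 1,000 requests with a canary that failed 5 of 50, and the canary's higher error rate leads to a rollback decision.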
For a deeper look at canary analysis, see this blog on Canary Analysis.
Benefits of Canary Deployment
Exposing a new deployment to a small number of users reduces the risk of a significant production error.
- Zero production downtime with faster rollback:
If newly released software is not deemed fit after the smoke test, sanity test, and capacity test (with a small share of production traffic), it is easily rolled back.
- In case of an error, traffic is simply re-routed back to the baseline, and the error is eliminated from the customer’s perspective. Engineers can then determine the root cause and correct it before re-introducing the update. This means that things can return to normal quickly and easily at the slightest hint of a problem.
- Less Costly with Small Infra:
Because you run the canary deployment on a small subset of users, you need only a small amount of extra infrastructure. Unlike the blue-green strategy, where a new production-like infrastructure is provisioned to deploy the release, canary initially requires only a small infrastructure to deploy your changes and vet whether your application runs fine for the end user.
- Flexibility for businesses to experiment with new features:
Because the canary instance is tested with minimal impact on user experience and infrastructure of the overall organization, developers get the confidence to innovate.
- Ability to carry out A/B testing: The canary creates an inherent problem: the new, smaller instance has to be compared with the older version. Comparing two versions that run at different scales leads to an inaccurate assessment of the canary. To solve this problem, we further divide the canary into two equal parts and perform an A/B test to determine the stability of the canary release. Read more on the analysis in our next blog.
- The canary instance can be exposed to progressively increasing load (5%, 6%, 10%, 25%, 50%, 100%) to test its production stability step by step.
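The progressive ramp and the quick rollback described in the list above could look roughly like the following. This is a minimal sketch under stated assumptions: `ramp_canary` and the `is_healthy` callback are hypothetical names, and a real rollout would change router weights and wait for metrics at each step rather than call a function directly.

```python
from typing import Callable, List, Sequence, Tuple

# Traffic percentages taken from the list above.
RAMP_STEPS = (5, 6, 10, 25, 50, 100)


def ramp_canary(
    is_healthy: Callable[[int], bool],
    steps: Sequence[int] = RAMP_STEPS,
) -> Tuple[int, List[int]]:
    """Walk through the ramp and return (final canary share, steps attempted).

    is_healthy(percent) is a hypothetical check that reports whether the canary
    looked stable while serving that share of traffic.
    """
    attempted: List[int] = []
    for percent in steps:
        attempted.append(percent)
        if not is_healthy(percent):
            # Roll back: all traffic returns to the baseline.
            return 0, attempted
    return (attempted[-1] if attempted else 0), attempted
```

Stopping at the first unhealthy step is the design choice that keeps the blast radius small: a problem that only appears at 25% of traffic never reaches the remaining users.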
Canary Works for All Deployment Sizes
Each canary deployment is typically a small delta and may take minutes to hours to complete. This makes canary well suited to fast and frequent updates. The shorter deployment cycles benefit organizations by reducing time to market and giving customers more product value in less time.
The canary is not only good for quick updates; it also works well for large and distributed systems. For example, in a highly distributed organization with clients across geographies, each region has the flexibility to defer the update based on its own risk assessment.
Constraints to Implement Canary Deployments
- It can be time-consuming and error-prone without automation:
Many companies today execute the analysis phase of canary deployments in a siloed and non-integrated fashion. A DevOps engineer is assigned to collect monitoring data and logs from the canary version and analyze them manually. The process is time-consuming and does not scale to rapid deployments in a CI/CD process. An inaccurate analysis can lead to the wrong decision to roll back or roll forward a new release. You can read about OpsMx Autopilot, which helps large enterprises automate the analysis step and improve the speed and reliability of CI/CD processes.
- On-Premise/thick client applications are difficult to update:
In an environment where the application is installed on personal devices, it becomes challenging for a business to perform a canary deployment. One workaround is to set up an auto-update environment for end users.
- Implementation can be tricky:
- Managing different versions of the application with canary is smooth, but managing the databases requires a new set of skills. When we modify how the application interacts with the database, or change the database schema, we end up with a very complex deployment process.
- To be able to perform the canary, we first have to change the database schema so that it supports two or more versions of the application. This allows the old and new versions of the application to run simultaneously.
- Once the new database architecture is in place, the new version can be deployed and switched over.
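One common way to make a schema support two application versions at once is an additive, backward-compatible change, such as adding a nullable column. The sketch below illustrates that idea with Python's built-in `sqlite3` module. It is an assumption chosen for illustration, not the approach the article prescribes: the `users` table, the `email` column, and both insert functions are hypothetical.

```python
import sqlite3


def old_version_insert(conn: sqlite3.Connection, name: str) -> None:
    """Baseline application code: knows nothing about the new column."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_version_insert(conn: sqlite3.Connection, name: str, email: str) -> None:
    """Canary application code: also writes the new column."""
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


def run_demo() -> list:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

    # Schema change applied before the canary is deployed. The column is
    # nullable, so inserts from the old version keep working unchanged.
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

    # Both versions now write to the same table side by side.
    old_version_insert(conn, "alice")
    new_version_insert(conn, "bob", "bob@example.com")

    rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
    conn.close()
    return rows
```

Rows written by the old version simply carry no value in the new column, which is why both versions can share the table while the canary runs. Destructive changes, such as dropping or renaming a column, would need to wait until the old version is fully retired.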
The canary deployment strategy is widely used because it lowers the risk of moving changes into production while reducing the need for additional infrastructure. Organizations using canary can test the new release in a live production environment while not simultaneously exposing all users to the latest release.
Published at DZone with permission of Jyoti Sahoo. See the original article here.