Bad deployments are a resource-draining problem for every development team. Without the help of software deployment tools, spotting a bad deployment is like finding a needle in a haystack. Almost 1/3 of software businesses count on their end users to report these errors. In the last company I worked for, we’d deploy, wait a day or two, and assume it was all fine because we didn’t hear many complaints.
What we didn’t factor in was that only 1% of customers report software errors, and any reports were usually vague and never gave diagnostic details. But we didn’t know a better way.
In truth, our development pipeline could have been better automated to avoid bad deployments and detect errors – without relying on our customers to tell us.
First, What Is a “Bad” Deployment?
Warning signs of a deployment that has caused problems:
- Increased P99 load time.
- Customers taking to social media.
- Higher error rates.
- Shorter customer sessions.
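The two measurable signs in this list, P99 load time and error rate, can be checked automatically by comparing a window of metrics from before the deployment with one from after it. The following is a minimal sketch, not a production monitor: the dictionary keys (`load_times_ms`, `errors`, `requests`) and the default thresholds are assumptions chosen for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def warning_signs(before, after, max_p99_ratio=1.5, max_error_ratio=2.0):
    """Compare two metric windows and list the warning signs found.

    Each window is a dict with the hypothetical keys
    load_times_ms (list of numbers), errors (int), and requests (int).
    The ratio thresholds are illustrative defaults, not recommendations.
    """
    signs = []

    p99_before = percentile(before["load_times_ms"], 99)
    p99_after = percentile(after["load_times_ms"], 99)
    if p99_after > p99_before * max_p99_ratio:
        signs.append("p99 load time increased")

    # Guard against an empty window so the division cannot fail.
    rate_before = before["errors"] / max(before["requests"], 1)
    rate_after = after["errors"] / max(after["requests"], 1)
    if rate_after > rate_before * max_error_ratio:
        signs.append("error rate increased")

    return signs


before_deploy = {"load_times_ms": [100] * 99 + [200], "errors": 1, "requests": 100}
after_deploy = {"load_times_ms": [100] * 95 + [400] * 5, "errors": 5, "requests": 100}
print(warning_signs(before_deploy, after_deploy))
```

A check like this is only as good as the windows you feed it, so in practice you would compare similar traffic periods rather than, say, a quiet night against a busy morning.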
I’m not saying wait until your deployment is perfect. Imperfection is part of the continuous delivery process. In our development team, we say, “Move fast and break things – just fix them quickly.”
You may be thinking that just one bad deployment can’t have that much of an effect on business. An example of a terrible deployment happened on Wall Street in 2012. Knight Capital Group lost more than $400 million in just 45 minutes, and was pushed to the brink of bankruptcy, because of a single failed deployment. It sounds too bad to be true, but this particular deployment caused complete havoc:
“During the deployment of the new code, one of Knight’s technicians did not copy the new code to one of the eight SMARS computer servers. Knight did not have a second technician review this deployment and no one at Knight realized that the Power Peg code had not been removed from the eighth server, nor the new RLP code added. Knight had no written procedures that required such a review.” -SEC Filing | Release No. 70694 | October 16, 2013.
In a post-deployment analysis, DevOps expert Doug Seven advises two principles that could have prevented the collapse of Knight Capital:
- Releasing software should be a repeatable, reliable process.
- Automate as much as is reasonable.
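One small piece of that automation is a check that every server reports the version you meant to ship, which is the kind of review the SEC filing says was missing. The sketch below assumes you can already collect a mapping of server name to reported version; the server names and version strings are hypothetical, and how you gather them (a health endpoint, your deployment tool, and so on) is left open.

```python
def find_stale_servers(reported_versions, expected_version):
    """Return the sorted names of servers not running expected_version.

    reported_versions maps a server name to the version it reports.
    """
    return sorted(
        name
        for name, version in reported_versions.items()
        if version != expected_version
    )


# Eight hypothetical servers; the last one was missed during the rollout.
reported = {f"server-{n}": "2.0" for n in range(1, 8)}
reported["server-8"] = "1.9"

stale = find_stale_servers(reported, expected_version="2.0")
if stale:
    print("Deployment incomplete on:", ", ".join(stale))
```

Running a check like this as the last step of every release makes the process repeatable: a half-finished rollout fails loudly instead of going unnoticed.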
It’s common to believe that disparate tools are expensive and unnecessary. However, that cost is negligible compared to the cost of not integrating tools to aid the deployment process. Below I’ve highlighted six essential software deployment tools that will make your deployment pipeline both more reliable and repeatable, with a few examples.
Continuous Integration (CI) Tools: Always Be Testing
As software companies grow, so do their code bases. CI tools like TeamCity, Jenkins, and Visual Studio build your product after every change made by a developer, ensuring the code base can produce a shippable product at all times.
While CI tools are great for building your product, they can also run unit tests for you. Code bases continually change, so having a system validate your products as they change is a good line of defense.
Automated Deployment Software: Roll Back if Needed
Manual deployments are prone to human error. Automating the deployment process with a tool like Octopus Deploy or AWS CodeDeploy makes it easier for anyone on the team to deploy new features safely multiple times a day (we deploy up to six times a day, for example).
Registered a bad deployment? No problem. With only a few clicks you can quickly roll back to a previously healthy deployment without developer intervention.
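The decision behind a rollback is simple enough to sketch: walk back through the release history and pick the most recent release that was known to be healthy. This is an illustration of the idea only, not the API of Octopus Deploy or AWS CodeDeploy; the `Release` type and its `healthy` flag are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Release:
    version: str
    healthy: bool  # hypothetical flag, set by whatever health checks you trust


def rollback_target(history: List[Release]) -> Optional[Release]:
    """Pick the release to roll back to.

    history is ordered oldest to newest and its last entry is the release
    that is currently live. Returns the most recent healthy release before
    the live one, or None if there is no such release.
    """
    for release in reversed(history[:-1]):
        if release.healthy:
            return release
    return None


history = [
    Release("1.0", healthy=True),
    Release("1.1", healthy=True),
    Release("1.2", healthy=False),
    Release("1.3", healthy=False),  # currently live and misbehaving
]
target = rollback_target(history)
print(target.version if target else "no healthy release to roll back to")
```

Skipping over unhealthy releases matters: rolling back to the release immediately before the live one is not always safe if that release was itself replaced because of a problem.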
Crash Reporting: Detect Bad Deployments Early and Avoid Vague Error Messages
Software crashes and performance problems caused by poor deployments are elusive. Inaccessible log files bury important information behind vague error reports like “Deployment failed.” The truth is, the more information surfaced about problems, the better. Enter Crash Reporting.
Crash reporting software closes the feedback loop between developer and customer by providing details like the stack trace. For example, Raygun features deployment tracking which will alert you to unusual error spikes after a deployment and even tell you which version introduced issues. Fixing errors becomes much faster.
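To make the “which version introduced it” idea concrete, here is a minimal sketch that maps each error to the earliest version in which it was seen. It is not how Raygun or any other product implements deployment tracking; the event format (a version string paired with an error signature) and the sample data are assumptions for illustration.

```python
def first_seen_versions(events, version_order):
    """Map each error signature to the earliest version it appeared in.

    events is an iterable of (version, signature) pairs in any order.
    version_order lists every version from oldest to newest.
    """
    position = {version: index for index, version in enumerate(version_order)}
    first_seen = {}
    for version, signature in events:
        known = first_seen.get(signature)
        if known is None or position[version] < position[known]:
            first_seen[signature] = version
    return first_seen


events = [
    ("2.1", "NullReference in Checkout"),
    ("2.0", "Timeout in Search"),
    ("2.1", "Timeout in Search"),
]
introduced = first_seen_versions(events, version_order=["2.0", "2.1"])
for signature, version in sorted(introduced.items()):
    print(f"{signature}: first seen in {version}")
```

In this sample data, the checkout error is new in 2.1, which points at the latest deployment, while the search timeout predates it.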
In Knight’s case, error reports highlighting the bad deployment landed in employees’ inboxes but weren’t marked as “critical,” meaning the emails went unopened. Crash and error reporting distils error reports into just one digest.
While crash reporting is great for critical problems, it doesn’t tell the complete story of other non-critical problems that frustrate customers. What can you do to tell a better story of a customer’s experience?
Analytics Software: Understand the Bigger Picture
Analytics help you to visualize high-level problems so developers can work towards low-level solutions. The data that analytics surfaces can not only tell you the hard numbers like purchases made but also the harder to measure metrics like user retention. Analytics software, like Real User Monitoring, will show you exactly where performance problems lie and what you can do to fix them.
Over time, analytics in your product help paint the bigger picture of whether a deployment was bad. Good analytics tools let you get even more granular, right down to the affected user.
Application Performance Management (APM) Tools: Know Your Environment
You’ve just deployed a new feature that brought in a lot of new customers. That’s great! But can your infrastructure handle the volume of new customers, all navigating through your software?
If your infrastructure wasn’t ready, you might get a few customers taking to social media asking about slow load times on your website. Raising the issue with the development team could result in the answer: “I don’t know, it’s a problem with the servers… I think?”
APM tools help to shed light on the black boxes that are your servers. Sometimes the problem is not a bug in the code but the environment in which it runs. APM tools show you whether the deployment itself was bad or whether the infrastructure was the cause.
ChatOps Tools: Get Notified Faster
ChatOps tools like Slack and HipChat are great for internal team communication, but they have grown well beyond that. They can now integrate with your other software deployment tools for better communication.
Want to know when your servers are under heavy load? You no longer need to stare at ever-changing numbers in the hopes of spotting a problem. Set up conditions and integrate your ChatOps tool with your analytics or APM software, and you will be alerted automatically when something needs your attention.
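A condition like this boils down to a threshold check plus a message posted to a chat webhook. The sketch below assumes a webhook that accepts a JSON body with a `text` field, which is the shape Slack’s incoming webhooks use; the metric name, the threshold, and the webhook URL are placeholders you would supply yourself.

```python
import json
import urllib.request


def build_alert(metric, value, threshold):
    """Return a chat message payload if value exceeds threshold, else None."""
    if value <= threshold:
        return None
    return {"text": f"{metric} is {value}, above the threshold of {threshold}"}


def post_alert(webhook_url, payload):
    """POST the payload as JSON to the given webhook URL.

    webhook_url is supplied by your chat tool; none is hard-coded here.
    """
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


alert = build_alert("CPU load (%)", 95, threshold=80)
if alert is not None:
    print(alert["text"])
    # post_alert(your_webhook_url, alert)  # uncomment once you have a real URL
```

Keeping the threshold check separate from the network call means the alert logic can be tested without sending anything to a real channel.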
Software Deployment Tools Make Your Life Easier
It’s all about having your software tell you what’s wrong, so your customers don’t have to. As Doug Seven explains, “It’s not enough to build great software and test it; you also have to ensure it is delivered to market correctly so that your customers get the value you are delivering (and so you don’t bankrupt your company).”