Monitoring Is Testing
A preview of DZone's upcoming research Guide to Code Quality and Software Agility
Software quality has traditionally focused only on the number of defects in a system. Testing was the main technique for decreasing those defects. Today, we know that quality is not something you can test into a product—it has to be part of the product from the start. And while testing can still be used to ensure quality, there are other strategies we need to consider. The definition of quality is not the same for every organization, because the type of product being developed changes with an organization’s needs and priorities.
Testing Depends on the Deployment Model
First, let’s consider boxed software: software that is packaged and sold in a “box” (virtual or physical). This type of software is difficult to update, and feedback from users is traditionally scarce. Since updates take a long time and are typically expensive, extensive testing is needed before releasing boxed software. Because of cost, automated testing rather than manual testing would seem to be the preferred choice, but a balance of methods is necessary to test these programs effectively.
A major problem with automated tests is that they are typically ineffective at identifying new bugs unless they are designed for that purpose. Unit and functional tests should be stable and consistent, reproducing the same result every time. This creates an unfortunate blind spot with regard to new bugs, because these tests can only identify regressions, something that programming teams often forget.
In my experience, focusing 100% on stable unit and functional tests, without any other test component, will not give you the software quality you want. However, integrating other techniques can help fill the gaps. Manual testing finds new bugs, and so do tools that are designed to find new bugs. Fuzz testing, where inputs are mutated randomly, finds one category of bugs. There are also variants of fuzz testing tools that are smarter than the basic random-input versions. These analyze the code to figure out what values to use. IntelliTest in Visual Studio 2015, formerly known as Pex, is such a tool.
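To make the basic random-input idea concrete, here is a minimal sketch of a mutation fuzzer. The function under test, `parse_ratio`, and the helper names `mutate` and `fuzz` are hypothetical and exist only for illustration; the sketch also assumes that `ValueError` is the target's documented way of rejecting bad input, so only other exceptions count as findings. Real fuzzing tools are considerably more sophisticated.

```python
import random


def parse_ratio(text):
    """Hypothetical function under test: parse 'a/b' into a float."""
    numerator, denominator = text.split("/")  # ValueError unless exactly two parts
    return int(numerator) / int(denominator)  # latent bug: denominator may be 0


def mutate(seed, rng):
    """Apply one random edit (insert, replace, or delete) to the seed input."""
    chars = list(seed)
    op = rng.choice(("insert", "replace", "delete"))
    if op == "insert" or not chars:
        chars.insert(rng.randrange(len(chars) + 1), chr(rng.randrange(32, 127)))
    elif op == "replace":
        chars[rng.randrange(len(chars))] = chr(rng.randrange(32, 127))
    else:
        del chars[rng.randrange(len(chars))]
    return "".join(chars)


def fuzz(target, seed, iterations=20_000, rng_seed=0):
    """Return (input, exception name) pairs for unexpected exceptions."""
    rng = random.Random(rng_seed)  # fixed seed so a run can be reproduced
    findings = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # assumed to be the documented rejection path, not a bug
        except Exception as exc:
            findings.append((candidate, type(exc).__name__))
    return findings


if __name__ == "__main__":
    for bad_input, error in fuzz(parse_ratio, "1/1")[:5]:
        print(repr(bad_input), "->", error)
```

A stable unit test that only ever feeds `"1/1"` to this function would keep passing; the fuzzer is what stumbles onto inputs the author did not think of.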
If boxed software is at one end of the software model spectrum, services are at the other end, in particular, services in a cloud environment. Cloud services offer the potential to have multiple versions of your service running at the same time during deployments. In a service, the need for thorough up-front testing is not as important as it is in boxed software. Instead, the ability to detect defects in live applications is far more important. In fact, it’s more effective to focus your software refinement efforts on detection in a services environment, because there is always the option to roll back to the last stable version at the push of a button.
Clearly, a different set of tools is required to ensure quality in services than what is required for boxed software. For example, in a service you want the ability to try a new version on only a single machine or even on a small subset of users before the new version is released to all users. You also need the ability to roll back to an older version and limit user impact if something goes wrong.
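One common way to expose a new version to a small subset of users is to hash each user into a bucket and compare the bucket against a rollout percentage. The sketch below is one possible approach, not a description of any particular platform; the names `in_canary` and `pick_version`, the salt value, and the version labels are all assumptions made for illustration.

```python
import hashlib


def in_canary(user_id, percent, salt="release-candidate"):
    """Deterministically decide whether a user is in the canary group.

    `percent` is the share of users (0 to 100) who should get the new version.
    The salt is a hypothetical per-release label so that different releases
    pick different users.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < percent * 100


def pick_version(user_id, percent, stable="v1", candidate="v2"):
    """Route a user to the candidate version only if they are in the canary group."""
    return candidate if in_canary(user_id, percent) else stable


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    on_candidate = sum(pick_version(u, percent=5) == "v2" for u in users)
    print(f"{on_candidate} of {len(users)} users routed to the candidate")
```

Because the decision depends only on the user ID and the salt, a given user sees the same version on every request, and rolling back amounts to setting the percentage to 0.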
However, this does not mean that you stop all up-front testing for services. It only means that you might focus some (if not the majority) of your efforts on tools and techniques different from those used to test boxed software. You also want to run tests continuously in production (testing in production, or TiP). That is, you essentially want to run a certain set of tests all the time to make sure your service is healthy.
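Running a fixed set of tests all the time can be as simple as a loop over named probes. The following is a minimal sketch under stated assumptions: each probe is a hypothetical zero-argument callable that returns a truthy value when the service looks healthy, and `on_failure` stands in for whatever alerting mechanism you actually use.

```python
import time


def run_probes(probes):
    """Run each named probe once; a probe that raises counts as failed."""
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    return results


def watch(probes, interval_seconds, rounds, on_failure, sleep=time.sleep):
    """Run all probes `rounds` times, reporting failed probe names each round."""
    for round_number in range(rounds):
        results = run_probes(probes)
        failed = sorted(name for name, ok in results.items() if not ok)
        if failed:
            on_failure(failed)
        if round_number < rounds - 1:
            sleep(interval_seconds)


if __name__ == "__main__":
    # Hypothetical probes; real ones would exercise the live service.
    probes = {
        "homepage_renders": lambda: True,
        "checkout_completes": lambda: False,
    }
    watch(probes, interval_seconds=1, rounds=3,
          on_failure=lambda names: print("unhealthy:", names))
```

In a real deployment the loop would run indefinitely and the probes would issue actual requests against production; the `sleep` parameter is injectable only so the loop can be exercised quickly.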
All Services Need Instrumentation and Monitoring
How do you know your service is healthy and behaves the way it should? The answer is instrumentation and monitoring. Your application needs instrumentation that exposes health properties such as requests per second, request completion time, and failure rate. This instrumentation, in turn, requires monitoring and maintenance: everything from notifications that wake somebody up in the middle of the night, to dashboards with pretty graphs, to automatic actions such as scaling a cloud service by adding and removing instances.
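As an illustration of what exposing these health properties might look like, here is a minimal in-process sketch. The class `RequestMetrics` and the helper `instrumented` are hypothetical names; a production service would normally rely on an existing metrics library rather than hand-rolled counters.

```python
import time
from dataclasses import dataclass, field
from typing import List


@dataclass
class RequestMetrics:
    """Collects the three health properties named in the text."""
    started_at: float = field(default_factory=time.monotonic)
    durations: List[float] = field(default_factory=list)
    failures: int = 0

    def record(self, duration_seconds, ok):
        self.durations.append(duration_seconds)
        if not ok:
            self.failures += 1

    def snapshot(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(now - self.started_at, 1e-9)  # avoid division by zero
        total = len(self.durations)
        return {
            "requests_per_second": total / elapsed,
            "mean_completion_seconds": sum(self.durations) / total if total else 0.0,
            "failure_rate": self.failures / total if total else 0.0,
        }


def instrumented(metrics, handler, *args, **kwargs):
    """Call a request handler, recording its duration and whether it raised."""
    start = time.monotonic()
    try:
        result = handler(*args, **kwargs)
    except Exception:
        metrics.record(time.monotonic() - start, ok=False)
        raise
    metrics.record(time.monotonic() - start, ok=True)
    return result


if __name__ == "__main__":
    metrics = RequestMetrics()
    instrumented(metrics, lambda: "ok")
    print(metrics.snapshot())
```

A monitoring system would read `snapshot()` periodically and feed the values into alerts, dashboards, or scaling decisions.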
Good monitoring is relative. For example, using the absolute number of a certain failure per second means that depending on how popular the service is, the monitor is more or less likely to trip. I’ve experienced this many times: an alert happens that has never happened before and when the problem is investigated, it turns out that the threshold for the alert is some absolute number that represented a failure rate of a few percent a few years ago, but with recent load, the absolute number only represents a fraction of a percent.
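The drift described here is easy to see with a little arithmetic. The numbers below are assumptions chosen purely for illustration: a fixed threshold of 50 failures per second, evaluated at three different traffic levels.

```python
def failure_share(failures_per_second, requests_per_second):
    """Fraction of all requests that the absolute failure count represents."""
    return failures_per_second / requests_per_second


THRESHOLD = 50  # assumed absolute alert threshold, in failures per second

for rps in (1_000, 10_000, 50_000):
    share = failure_share(THRESHOLD, rps)
    print(f"{rps:>6} req/s: alert fires at {share:.1%} of requests failing")
```

The threshold never changed, but what it means did: at the lowest load it corresponds to one request in twenty failing, and at the highest load to one in a thousand.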
My advice is to use relative monitors instead. When working on a service, failures (or other anomalies) should almost always be compared to the total number of requests to your service. With relative monitors, the system triggers alerts based on percentages of that total load rather than absolute values. The only real exception to this rule is latency monitoring, since latency typically requires a different approach. For example, many shops alert when the 95th percentile of latency exceeds some absolute value: if the slowest 5% of requests take longer than 200 ms, you want to act. While the threshold here is an absolute value, the use of percentiles still gives you a relative property that you want in your arsenal of statistics to monitor. You want your monitors to trigger on real problems and not raise too many false alarms, because we all know what happens to somebody who cries wolf all the time: they get ignored!
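Both kinds of monitor can be sketched in a few lines. The function names, the 1% failure-rate limit, and the choice of the nearest-rank percentile method are assumptions made for illustration; the 95th percentile and 200 ms figures come from the example in the text.

```python
import math


def failure_rate_alert(failures, total_requests, max_rate=0.01):
    """Relative monitor: alert when failures exceed a share of total requests."""
    if total_requests == 0:
        return False  # no traffic, so nothing to compare against
    return failures / total_requests > max_rate


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def latency_alert(latencies_ms, pct=95, limit_ms=200):
    """Alert when the given latency percentile exceeds an absolute limit."""
    return percentile(latencies_ms, pct) > limit_ms


if __name__ == "__main__":
    # The same 50 failures are alarming at low load and unremarkable at high load.
    print(failure_rate_alert(50, 1_000))
    print(failure_rate_alert(50, 50_000))
    # Five slow requests in a hundred stay within the 95th percentile budget.
    print(latency_alert([100] * 95 + [500] * 5))
```

Unlike an absolute failure count, neither check needs to be retuned as traffic grows, which is what keeps the false-alarm rate down over time.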
While I recommend gravitating towards instrumentation and monitoring for services, this should be balanced with testing for optimum results. As for boxed software, a fair amount of instrumentation is necessary for an awareness of what problems the users have and how your product is being used. However, because the cost of updating boxed software is so high, you have to hedge your bets with more testing.
Hybrid Software Models
There’s a good deal of software that doesn’t fall solely under the services or boxed categories. For example, there are a lot of apps being developed today—small applications typically installed on a phone or tablet. Apps are interesting to consider as a hybrid model. They are very close to services in how they behave, but they are like boxed software because new versions need to be downloaded and installed by the user. Apps are typically easy to install, but there is no guarantee that they will be updated. Also, different platforms (Android, iOS, Windows) take different lengths of time to review and deploy updates, so even within this category, you need to consider how much testing is needed versus relying on the ability to provide quick updates for your app. Ultimately, because of the similarities between apps and services, instrumentation of apps’ behavior is very important in order to create a high quality app.
Back to Defining Software Quality
There are several variables that affect how your organization should define software quality. User base size is a large part of the equation. If your software has a single user, you probably want less up-front testing than if you have millions of users.
Another consideration is the cost of a defect. If your software deals with trading stock on behalf of other users, an outage of just a few seconds could cost you a lot of money even with just a few users. So, in this case, a little more up-front testing is probably necessary, even though your software is a service. On the other hand, a service that provides daily stock quotes to millions of users probably has little need for significant up-front testing.
Life-critical software, like the code found in medical devices, is another example of code that needs more up-front testing, both because of the rules and regulations around its reliability and because of the significant human cost of any defect in that software. Outside of embedded software, however, most modern boxed software can now be updated relatively quickly, so here are some basic principles you can follow to create high-quality software in both boxed and service settings.
Make sure you have the capability to update your software quickly.
Make sure you know how your software behaves in the hands of your users with instrumentation and monitoring, preferably through limited release of your product to only a fraction of your total user base.
Use automated tests to protect against regressions.
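The last principle is the easiest to show in code. Below is a minimal sketch of a regression test using Python's standard `unittest` module; the function `normalize_username` and the behaviors pinned by the tests are hypothetical, standing in for bugs that were fixed once and should never come back.

```python
import unittest


def normalize_username(raw):
    """Hypothetical function under test: trim whitespace and lowercase."""
    return raw.strip().lower()


class NormalizeUsernameRegressionTest(unittest.TestCase):
    """Each test pins down a behavior that must not change between releases."""

    def test_surrounding_whitespace_is_removed(self):
        self.assertEqual(normalize_username("  alice \n"), "alice")

    def test_case_is_folded(self):
        self.assertEqual(normalize_username("Alice"), "alice")

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalize_username(""), "")
```

Saved in a test module, these can be run with `python -m unittest`. They are stable and reproducible by design, which is exactly why they guard against regressions and not against new bugs.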
All in all, it comes down to observing how your software behaves in real life rather than in an artificial environment. That is how you achieve software quality: by measuring user impact, and not just preventing bugs, but responding quickly once you find them.