Waste #7: Defects


· Agile Zone ·
Welcome to the final episode of our series "The Seven Wastes of Software Development." In episode one, we introduced the concept of eliminating waste from our software development efforts. Waste elimination can be traced all the way back to the mid-1900s, the birth of lean manufacturing, and the Toyota Production System (TPS). This is how Taiichi Ohno, the father of the TPS, described its essence:

All we are doing is looking at the time line, from the moment the customer gives us an order to the point when we collect the cash. And we are reducing the time line by reducing the non-value adding wastes. [1]

In this episode, I'd like to focus on "Defects." That defects are a source of software development waste should be self-evident, but exactly how do we measure the amount of waste created by a single defect?


The Poppendiecks offer us a simple formula [1] that can help us quantify the amount of waste caused by a single defect:

Waste = (Impact of Defect) x (Time Defect Lies Undetected)

What this tells us is that holding the impact of a single defect as constant, the amount of waste created by that defect increases with the amount of time that it lies undetected.

Consider a defect of massive impact - perhaps an algorithmic error that would over time cause the loss of millions of dollars within a financial system - that is detected by a continuous integration system within minutes of the defect being committed to the project's source code repository. The waste generated by this defect is minimal, as it can be corrected by a developer within minutes of detection.

Now consider a minor defect - perhaps an overly complex algorithm that causes certain business operations to take 1.5 times longer than necessary for users to complete. That 50% overhead translates into real hours on the clock, multiplied by real hourly wages, that your company loses because of this defect. Let's suppose that this defect lies undetected by anyone for three weeks. Do you see that the waste created by this defect is quite significant?
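
To put rough numbers on that comparison, here is a minimal sketch that treats waste as impact multiplied by the time a defect lies undetected. Every figure in it is an assumption chosen only for illustration: the `Defect` class, the dollars-per-minute impact values, and the 40-hour work week are hypothetical and do not come from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Defect:
    description: str
    impact_per_minute: int   # hypothetical cost in dollars per minute undetected
    minutes_undetected: int

    @property
    def waste(self) -> int:
        # Waste grows linearly with the time the defect lies undetected.
        return self.impact_per_minute * self.minutes_undetected


# Assumption: a work week of 5 days x 8 hours x 60 minutes.
MINUTES_PER_WORK_WEEK = 5 * 8 * 60

# High impact, but caught by continuous integration within five minutes.
critical = Defect("algorithmic error caught by CI", 200, 5)

# Low impact, but unnoticed for three work weeks.
minor = Defect("slow algorithm, unnoticed for three weeks", 2, 3 * MINUTES_PER_WORK_WEEK)

for defect in (critical, minor):
    print(f"{defect.description}: ${defect.waste:,}")
# algorithmic error caught by CI: $1,000
# slow algorithm, unnoticed for three weeks: $14,400
```

Under these made-up numbers, the defect with one hundredth of the impact generates more than fourteen times the waste, purely because of how long it went unnoticed.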

As developers, we don't really get to choose the impact of the defects that we write. None of us comes into the office and says "I'm only going to write low impact defects today." What developer in his right mind would choose to write a high impact defect rather than a low impact defect? I say all this to emphasize that the variable we must seek to minimize is the time to defect detection. In fact, our ultimate goal is to detect all defects immediately after their introduction!

So what are some ways that we can detect defects more quickly?

  • Our first line of defense is our regression test suite, composed of automated tests written at the unit, integration, and functional acceptance levels. These suites are made even more effective through the use of Test-Driven Development for units and Acceptance Test-Driven Development for features.
  • Assuming that we have a regression test suite that covers a very high (>80%) percentage of our code base, and that suite covers all of the conditions we anticipate our code will realistically encounter, our second line of defense is the judicious use of exploratory testing. Exploratory testing by nature must be conducted by seasoned test professionals who can enter the minds of our users and exercise things in ways that we as developers do not anticipate. We all know that the one thing we insist our users will never try is the first thing they try following any production release.
  • Also incredibly important to our defect detection efforts are "antibody tests." I wrote about antibody tests in an earlier Agile Zone article entitled "You Are Your Software's Immune System." When confronted with a defect in your system, your first task is not to eliminate the defect. Your first task is to write a test that fails in the presence of that defect. Only then is it your duty to make that test pass. From that point on, if another hapless developer reintroduces the defect, the antibody test will detect it immediately. Just as antibodies enable our immune systems to react to and defeat invading antigens more quickly, antibody tests enable us to react to and eliminate bugs more quickly.
  • Another weapon in our arsenal is the combinatorial testing tool. Our code can face an effectively unlimited number of possible inputs, and it is humanly impossible for us to write tests by hand to cover all of the possibilities. Combinatorial testing tools can exercise our functions with incredibly large numbers of possible inputs and then report on their behavior. These tools will give us greater confidence that our code can stand up to even the most extreme conditions defect-free.
  • A great many areas of applications, often referred to as the "-ilities," are simply difficult to test. Think of scalability, usability, and securitility...err um, security. Excellent testing tools and methods exist to aid us in determining how well our applications stand up in any of these areas. Use them, and use them early in development, especially if any of these areas is of primary concern for your application.
  • Wrapped around all of these types of testing should be a robust continuous integration system that will execute the entire test suite immediately upon each and every code commit operation. Continuous integration is the only surefire method of detecting defects immediately.
  • Another consideration is the similarity of your continuous integration environment to your production environment. Any and all differences between this environment and production can, and often do, matter. In their excellent book Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation, Jez Humble and David Farley relate the story of a production deployment that was hampered by only a few bytes' difference in a dependent shared library. Those few bytes caused an error condition that proved impossible to replicate in any other environment, and they were detected only after an exhaustive audit of every possible divergence between the production and verification environments. The more closely you can replicate your production environment in your continuous integration system, the earlier you'll be able to detect and squash the defects that are most difficult to eliminate.
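
The antibody test idea from the list above can be sketched with Python's standard `unittest` module. This is only an illustration under stated assumptions: `average_order_value` and the empty-list crash it once suffered are hypothetical, invented here to stand in for whatever defect you have just found in your own system.

```python
import unittest


def average_order_value(order_totals):
    """Hypothetical function: mean order total, or 0.0 when there are no orders."""
    # The fix. Before this guard existed, an empty list raised ZeroDivisionError.
    if not order_totals:
        return 0.0
    return sum(order_totals) / len(order_totals)


class AntibodyTests(unittest.TestCase):
    def test_empty_order_list_returns_zero(self):
        # The antibody: written first, while the defect was still present,
        # so that it failed. It stays in the suite to catch any reintroduction.
        self.assertEqual(average_order_value([]), 0.0)

    def test_typical_orders_are_averaged(self):
        # Confirms that the fix did not change the normal behavior.
        self.assertEqual(average_order_value([10.0, 20.0]), 15.0)
```

Saved in a file inside your test directory, these tests can be run with `python -m unittest`, and a continuous integration system can run them on every commit along with the rest of the suite.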

These are by no means the only methods of speeding up defect detection, but they provide an excellent starting point on the road to success.
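
As one more illustration, the combinatorial testing mentioned in the list above can be approximated by hand with `itertools.product` from the Python standard library. This is a minimal sketch, not a substitute for a dedicated tool: the function `discounted_price`, its membership rule, and the chosen input values are all hypothetical.

```python
from itertools import product


def discounted_price(price_cents: int, percent_off: int, is_member: bool) -> int:
    """Hypothetical function under test: price after discount, in cents."""
    if is_member:
        # Assumed rule: members get 5 extra percentage points, capped at 100.
        percent_off = min(100, percent_off + 5)
    return price_cents * (100 - percent_off) // 100


def check_all_combinations():
    """Try every combination of the inputs below and collect violations."""
    prices = [0, 1, 99, 100, 10_000, 2**31 - 1]
    percents = [0, 1, 50, 95, 99, 100]
    memberships = [False, True]

    failures = []
    for price, percent, member in product(prices, percents, memberships):
        result = discounted_price(price, percent, member)
        # Invariant: a discount never makes the price negative or higher.
        if not (0 <= result <= price):
            failures.append((price, percent, member, result))
    return failures


print(check_all_combinations())
```

Rather than asserting exact outputs for each of the 72 combinations, the sketch checks an invariant that must hold for all of them, which is the usual way to keep this style of test manageable as the number of inputs grows.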

This concludes our series on "The Seven Wastes of Software Development." I hope that you've found this series helpful and that it has provided you with some action items to bring back to your own software development efforts. Why not share your plans with the rest of our readership in the comments section? Now hurry up and take out the garbage!


[1] Ohno, Taiichi. Toyota Production System: Beyond Large Scale Production. Productivity Press, 1988.

