Quality is the Answer
Quality is like security: it needs to be baked into every step of your process and is a key ingredient to success.
Everyone knows that quality is a key component in software development, but there’s an equal and opposite feeling that there just isn’t time to do the kind of analysis and testing it requires. We prioritize feature development over backfilling tests—or writing them in the first place—and our agile ceremonies lack the rigor and process quality requires. Compounding the issue, many people are experiencing the limitations of traditional unit tests and are tempted to abandon what little automated testing practice they have in place.
But quality is the only way you can meet your deadlines. Quality is the key to predictability and success. Pretty much no matter what the question, quality is the answer.
Yes, it’s a wild claim.
Quality is an investment, and it can be painful and immediately disruptive to pull valuable engineering time from feature development to address quality concerns. However, just as with a financial investment, compound interest dictates that the sooner you participate, the bigger your returns.
What is Quality?
While part of quality is reducing the number of times our customers encounter the sad frowny face of a broken web page, it’s bigger than that. Quality is a practice and a mindset that spans the full software development life cycle and goes beyond the boundaries of the code in your repositories. Quality is:
- Vetting requirements with the business for completeness, accuracy, marketability, and compliance. It is having a process and methodology for requirements gathering, review, and sign off and a core group of product owners and subject matter experts to help steer the conversation and arrive at the correct answers—before coding even starts.
- Enforcing standards in design, documentation, and code to ensure the continuity of your business. Developers come and go, and onboarding new people and making them productive comes at an enormous cost—not just for the new hires themselves but for the experienced developers who need to train them.
- Effective automated testing to confirm requirements are implemented as expected and to allow you to release with confidence. More on testing later, but it is a key activity.
- Full root-cause analysis and regular retrospectives with actionable outcomes. Retrospectives, or post-mortems, are not just a place to air grievances or pat ourselves on the back; the results of these conversations need to be tangible items that teams need to start, stop, or continue doing to improve outcomes across the board. Each quality issue must be traced to its root cause so that the cause can be addressed proactively, stopping problems before they even start.
In many ways, quality is a lot like security: if you try to address it in only one way or in one part of your process, it will bite you in the end. And it will cost you dearly.
The Cost of Poor Quality
In 2002, NIST published a study that examined the cost of bug remediation across the different phases in a typical waterfall SDLC. They concluded that fixing a bug in production was 30 times more costly than fixing one found during the requirements phase. Even a bug found in UAT was 15 times more costly, and that’s before the bug even makes it out the door.
IBM followed NIST’s study up with one of its own nearly a decade later in 2010—after Agile had started to take root in the industry—that upped the highest cost number to 100x. Search the web for “cost of poor-quality software,” and you’ll see numbers in the TRILLIONS thrown around when the industry is taken as a whole. These are costs not only of bug remediation but lost productivity, support staff, missed sales, and customer flight, to mention just a few.
Clearly, none of this information is new, but the situation gets worse all the time. Feature development regularly goes off the rails because resources are leaked to provide production support and break-fix releases. Even if a project manages to retain its full staff for the duration, incremental development is stymied by poor-quality code, where a change over here can cause a cascading catastrophe over there.
In the world of cloud-native and distributed systems, the cost comes not just in fixing a bug, but in finding it in the first place. Logs are copious, transient “black cats” abound, and debugging highly distributed solutions can be a nightmare. Software has gotten more complex and less comprehensible, escalating frustration and slowing down developer performance.
Given all of this, the $64,000 question is: if everyone knows how expensive bugs are to address, why don’t we prioritize fixing them early in the process when they’re relatively cheap and harmless? The answer is that we often try ... but not in the right way and not hard enough.
The Failure of Unit Tests
Quality comes from test-driven development (TDD) and unit tests, right? Well, no, not really. Developers started writing automated unit tests in earnest in the late ’90s and early 2000s. With decently factored code, unit tests are easy to write and fast to run, yet they are somewhat useless and potentially even counter-productive. (For a treatise on the subject, refer to the oft-cited article http://rbcs-us.com/documents/Why-Most-Unit-Testing-is-Waste.pdf by James O. Coplien.) In a distributed system, a commonly cited estimate is that unit tests might catch only 20% of your bugs; the rest are introduced when several services are integrated to produce the business use case or workflow that is the customer-facing feature. Unit tests are fantastic for verifying code that implements functionality like complicated single-process calculations, complex validation, or file parsing and report generation, but how much of your platform is actually concerned with these things?
A unit test can’t confirm the serialization of data across the wire or into a database. It can’t uncover an issue when retry logic locks up your platform due to a cloud resource failure. It can’t catch that a service is expecting to be called by another service in a specific order or exercise eventual consistency or error compensation logic. If there is a bug in the requirements, no automated tests will uncover the issue. Even worse, unit tests are often highly coupled to the implementation of specific parts of the code instead of just to overall code behavior, so changes to the implementation, which are inevitable, can cause an avalanche of required changes in related unit tests.
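To make the serialization point concrete, here is a minimal Python sketch. The `quantity_for` helper, the shape of the order, and the `send_over_wire` stand-in for a service boundary are all hypothetical names invented for this illustration. The only real behavior it relies on is that JSON object keys are always strings.

```python
import json


def quantity_for(order: dict, sku_id: int) -> int:
    """Return the ordered quantity for a SKU, or 0 if it is absent."""
    return order["lines"].get(sku_id, 0)


def send_over_wire(order: dict) -> dict:
    """Hypothetical service boundary: serialize to JSON, then parse it back."""
    return json.loads(json.dumps(order))


order = {"order_id": 17, "lines": {1001: 3}}

# Unit-level check: the dict never leaves the process, so the
# integer key is still an integer and the lookup succeeds.
in_process = quantity_for(order, 1001)

# Boundary check: JSON turns the integer key 1001 into the string
# "1001", so the same lookup silently falls back to the default.
after_wire = quantity_for(send_over_wire(order), 1001)

print(in_process, after_wire)
```

The in-process lookup finds the quantity of 3, while the same lookup after the round trip returns 0. A test that never crosses the boundary, even a simulated one like this, has no way to notice.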
A whole world of process and functionality exists outside the unit. A platform with high code coverage and a suite of hundreds or thousands of unit tests can still host many tenacious bugs or be dangerous or costly to change. Integration tests, end-to-end tests, smoke tests, stress tests, performance tests, and chaos testing all likely give us a better return on our investment of engineering time but are often deprioritized because they are more difficult to write and require corresponding automation of related resources and infrastructure instead of just running pure code.
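As a small illustration of how coverage numbers can mislead, consider the sketch below. The `average` function and its test are hypothetical examples, not taken from any real platform. The single test executes every line of the function, yet one ordinary input still breaks it.

```python
from typing import Sequence


def average(values: Sequence[float]) -> float:
    """Return the arithmetic mean of the given values."""
    total = sum(values)
    return total / len(values)


def test_average() -> None:
    # This one test runs every line of average(), so the function
    # reports full line coverage.
    assert average([2.0, 4.0]) == 3.0


def fails_on_empty_input() -> bool:
    """Show the case the fully covered function still gets wrong."""
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False


test_average()
print(fails_on_empty_input())
```

Coverage records which lines ran, not which inputs were considered, so a green report here says nothing about the empty list.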
In addition, since bugs are cheapest to fix when they are caught in requirements (before the behavior is committed to code and corresponding tests), looking to unit tests, or any kind of automated testing, to save us is short-sighted. Technology must partner better with the business to reduce friction, inaccuracies, and incompleteness in the requirements gathering phase of each feature.
Quality is a mindset that requires a commitment from everyone and needs to be addressed at every stage of your process:
- Requirements gathering and validation
- Implementation, including:
  - Proper design and architecture
  - Coding standards and patterns
  - Structured code reviews and gated pull requests
  - Wide testing approach, favoring integration testing
  - Infrastructure automation
- Internal manual testing before UAT
- UAT testing and bug triage
- Production release processes
- Retrospectives and the resultant feedback loop
This requires both a quality champion in the company and buy-in from all participants in the development lifecycle. That said, buy-in can be hard to win. To score small wins that you can use to encourage more participation in the process, you might be tempted to pick a known quality problem and address it head-on, but quality is not a game of Whack-A-Mole. Before you embark on making changes to your process, you need a plan that starts with a vision of the future. This “to be” nirvana is what enables the creation of an incremental road map of progress.
Once the desired end state is articulated, the path to quality should always start by first addressing your points of greatest pain. These are different for every company, but some common ones are:
- After every major release, we have weeks of hot fixes we need to deploy.
- Our current support staff is overwhelmed, and we can’t hire any more people.
- Our review meetings each sprint always uncover changes or additions in requirements.
- It takes three months to get any new developer productive in the system.
- We’re afraid to lose Jane Smith, our Sr. Engineer, because only she knows how X works.
- We have high developer turnover because no one wants to support product Y.
- Our releases are consistently three weeks late because QA and UAT take longer than planned.
- We can’t provide accurate estimates for development of new features because we lose resources to production support requests.
And so on. Pick the worst pain, analyze it to uncover the root cause, and work to address that. Resolving root causes always provides the greatest return on your effort investment.
No matter where you start, you’ll begin to see results within a single iteration. Some problems might be low-hanging fruit with easy answers, like making sure the right people are in the room when discussing requirements or focusing new integration tests on a particularly buggy part of your platform, but other problems are more tenacious and require larger team efforts to eradicate.
As quality processes start to gain a foothold in each step in your SDLC, everything will become more predictable: resource allocation, release cadence, support loading, marketing plans, even estimates. With predictability comes greater transparency, a closer partnership with the business, an ability to meet deadlines, and continued business success.
Quality is not a bolt-on after-market feature to a proper development process; it is a key ingredient and the core answer to the question of how to improve.
Opinions expressed by DZone contributors are their own.