Like over two thousand others at DZone, I recently read Grzegorz Ziemoński's article on Test Driven Development (TDD). I always enjoy Grzegorz's articles and appreciate his bold willingness to state opinions. I especially respect any author who is so open about his experiences.
I'm also a fan of TDD, or at least what the JUnit folks call being Test Infected (admittedly not the best name). So, what I'm offering here is a counterpoint as opposed to a disagreement. As I've written before, I think it is often necessary to hold on to two ideas that are in tension. In the case of TDD, in my view, there is a serious benefit to building tests in parallel with writing the software. However, any attempt to argue that it is a holistic or complete approach to programming is bound to fail.
YAGNI and YAGBATTAC
One of the stronger arguments for TDD is that it helps keep the code clean, both in the sense of being designed for testability (which generally means well designed) and in the sense of not including anything that is not necessary to meet the requirements. Another way to say this is You Ain't Gonna Need It (or YAGNI). If it isn't immediately necessary to meet requirements, don't put it in as a way of future-proofing the code, because the future is uncertain and what you really need to add to the code might not fit into your plan.
But I think there's a counterpoint to YAGNI that is often invisible, which I will call You Ain't Gonna Be Able To Test All Cases (or YAGBATTAC, pronounced "yag-ba-tack"). For an example, take the Roman numeral code from my article on table-driven tests in Go. I have 41 separate test cases in that code, but obviously that falls far short of all the possible inputs the function should handle correctly.
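To make the shape of the problem concrete, here is a minimal sketch of a table-driven test in Go. The function name `toRoman` and the implementation are illustrative stand-ins for the code in that article, and the handful of cases below is deliberately far smaller than the 41 in the original, let alone the full input space:

```go
package main

import (
	"fmt"
	"strings"
)

// toRoman is an illustrative stand-in for the function under test:
// it converts a positive integer to its Roman numeral representation.
func toRoman(n int) string {
	vals := []struct {
		v int
		s string
	}{
		{1000, "M"}, {900, "CM"}, {500, "D"}, {400, "CD"},
		{100, "C"}, {90, "XC"}, {50, "L"}, {40, "XL"},
		{10, "X"}, {9, "IX"}, {5, "V"}, {4, "IV"}, {1, "I"},
	}
	var b strings.Builder
	for _, p := range vals {
		for n >= p.v {
			b.WriteString(p.s)
			n -= p.v
		}
	}
	return b.String()
}

func main() {
	// A table of input/expected pairs -- the table-driven style.
	// However many rows we add, it remains a sample of the input space.
	cases := []struct {
		in   int
		want string
	}{
		{1, "I"}, {4, "IV"}, {9, "IX"}, {14, "XIV"},
		{40, "XL"}, {90, "XC"}, {1994, "MCMXCIV"},
	}
	for _, c := range cases {
		got := toRoman(c.in)
		fmt.Printf("toRoman(%d) = %q (want %q, ok=%v)\n", c.in, got, c.want, got == c.want)
	}
}
```

Deciding which rows belong in that table is exactly the judgment call the next paragraph is about.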
And it does no good to claim, "well, I tested the most important cases" or "I tested the edge cases." In order to know what the "most important" or "edge" cases are for testing, we have to bring in extra knowledge from outside the process of TDD. We have to make decisions about what the requirements mean, and those decisions have to come from somewhere. That means we have an external standard which is the real source of our knowledge about when the code is "done." As a result, it is no longer enough to say, "when the code passes all the test cases, it is complete." Now we have to say, "when the code passes all the test cases, and the test cases represent all of the functionality required by the code, then the code is complete." That is a very different statement and one that is much more subject to our engineering judgment.
Advocates for TDD understand this. The XP page on test-first says to continue until there is "nothing left to test." Kent Beck points out that we get paid to code, not test. Where I think things go astray is when TDD itself is treated as the source of the decision about whether there are more tests to write, out of some idea that it is obvious when the system has "enough" or "complete" functionality. For simple cases like Roman numerals and squares, it might be possible to agree on "complete" functionality, but for real systems, it is not so easy.
Magic Is Going to Happen
Similarly, a critical step in TDD (and one that TDD advocates claim is the difference between success and failure) is refactoring. The idea is that we write a failing test first, then make it pass, then refactor to remove duplication.
Now, it is immediately obvious that while we have a clear measure of sufficiency for writing the code (when the new test passes, we are done), there is no such rule for refactoring. How much is enough? We talk about things like removing duplication or having only one return from a method, but we know that these are subjective rules of thumb that have to be broken sometimes. It seems that we are left again with our engineering judgment. This is an example of the necessary tension I mentioned above: we need to combine a willingness to go far in improving the quality of our code with YAGNI. Knowing when to stop means knowing how to balance those two seemingly contradictory ideas.
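A small, hypothetical illustration of the missing stopping rule: suppose the green step left the same summing loop duplicated in two functions. Extracting the loop into a helper is clearly an improvement, but nothing in the red-green-refactor cycle itself says whether to stop there (all names, the flat shipping rate, and the tax rate below are invented for the example; prices are in integer cents):

```go
package main

import "fmt"

// sum is the helper extracted during refactoring; before that step,
// both functions below contained this loop verbatim.
func sum(items []int) int {
	s := 0
	for _, v := range items {
		s += v
	}
	return s
}

// After refactoring, both functions share the helper and all tests
// still pass. But should we also extract the 500-cent flat rate?
// The 8% tax? TDD gives no rule; that call is engineering judgment.
func totalWithShipping(items []int) int {
	return sum(items) + 500
}

func totalWithTax(items []int) int {
	return sum(items) * 108 / 100
}

func main() {
	items := []int{1000, 2000}
	fmt.Println(totalWithShipping(items)) // 3500
	fmt.Println(totalWithTax(items))      // 3240
}
```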
And that takes us to a deeper problem with using TDD as a complete methodology. TDD examples like this one with Roman numerals are full of statements like "it's pretty obvious," "we have discovered a rule based on similarities," and "looks OK to me." You might say that these are "design smells," where some kind of design activity is going on in the mind of the programmer in a way that approaches jumping ahead to the solution in a leap of insight. "Seeing" a generalization like this (or the one that enables Grzegorz to arrive at square*square) is a pure act of human creation, what I call a "Magic Happens Here" step that cannot be reduced to a process or further decomposed into smaller subtasks.
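Grzegorz's exact intermediate steps are not reproduced here, but the shape of that leap can be sketched under invented names: the green step hard-codes an answer, and no mechanical rule turns the hard-coded version into the general one.

```go
package main

import "fmt"

// After the first failing test (say, for a square of side 2), the
// simplest code that passes is a constant.
func areaV1(side int) int {
	return 4 // passes only the case side == 2
}

// Generalizing to side*side is the "Magic Happens Here" step: a human
// sees the pattern; the failing tests do not derive it for us.
func areaV2(side int) int {
	return side * side
}

func main() {
	fmt.Println(areaV1(2), areaV2(2)) // 4 4 -- both pass the first case
	fmt.Println(areaV1(3), areaV2(3)) // 4 9 -- v1 fails as soon as a new case arrives
}
```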
TDD and CMM
And that is the way in which TDD, as it is often described, reminds me of traditional software processes like the Capability Maturity Model (CMM). To the extent that either becomes a rote list of steps to follow that promises to remove the need for human creativity and human aesthetics about what constitutes "good" design, "good" architecture, or "good" code, it ultimately gets in the way of building quality software rather than enabling it. And to the extent that TDD, or any other "process" or "practice" for making software, incorporates the fact that engineering is a creative activity, and that the process exists to serve and enable that creativity, it is useful.
As I said above, I approach this topic as a fan of TDD (and code review, and static analysis, and other "practices" for ensuring code quality). But there are times when I feel our industry looks too hard for some silver bullet that will take the uniquely human "craft" out of writing software. I would much prefer that we just admitted that much of what we do is more craft than science and spent our time learning to be better craftsmen.