Recently, I helped facilitate some discussion workshops on the topic of clean code. Each of the discussions seemed to be predicated on a belief that readability is the most important criterion by which to assess whether code is clean. Indeed, the groups spent a lot of time discussing ways to establish and police coding standards and the like. While I agree that this can be useful, I felt that the discussions missed the aspects of clean code that I consider to be the most important.
So, I thought it might be useful here to attempt to describe what I mean by the term.
(Disclaimer: None of what you are about to read is novel thinking. All of it has been said before by numerous great programmers. But I do think it needs to be repeated often.)
Firstly, I don’t call it clean code. That term seems to me to encourage the view that cleanliness is something superficial, that it is all about how the code looks. Instead, I call it habitable code, as described by Richard Gabriel:
“Habitability is the characteristic of source code that enables programmers, coders, bug-fixers, and people coming to the code later in its life to understand its construction and intentions and to change it comfortably and confidently.”
“Habitability makes a place liveable, like home. And this is what we want in software — that developers feel at home, can place their hands on any item without having to think deeply about where it is.”
So, while I agree that readability is incredibly important, habitability goes deeper. To me, habitable code is any code that can change at the same pace as the business. Small changes to business requirements (i.e., within the scope of the current domain) should incur a correspondingly low cost of change in the code, whereas large swings in business requirements can be expected to incur much larger development costs, because the new requirements do not fit the current codebase naturally or cannot reuse much of the existing working code.
Given that definition, it seems to me that the most important aspect of habitable code relates to its overall structure. I believe that this kind of habitability is achieved only when the top-level design (architecture) of the application is readily apparent in the code. I think the best way to achieve that is as follows:
- The application is divided into a small number of modules.
- Each module represents a meaningful concept in the domain and is named so that anyone familiar with the domain would understand its purpose.
- Each has a well-defined interface, again expressed in domain terminology.
- The lifecycle of each module and the relationships between the modules are expressed declaratively in the application’s entry point(s). That is, the application’s entry point(s) must state in clear declarative terms how these modules are plugged together to deliver the required business value.
- All connascence between the modules is obvious in the code at this level.
(If this feels like a re-statement of the four rules of simple design and/or structured programming, that’s no coincidence. No bad names and no unwanted coupling. I’m not inventing anything new here.)
For me these modules are a combination of form and function — their names partition the application domain and their behaviors describe the business logic as seen from 30,000 feet. The coupling between them must be declared, too; no hidden connascence.
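A minimal sketch of what such an entry point might look like, assuming a hypothetical bookshop domain. The module names (Catalogue, Payments, Ordering), their methods, and build_bookshop are all invented for illustration; only the shape matters: a few domain-named modules, each with a small interface, plugged together in one declarative place.

```python
from dataclasses import dataclass, field


@dataclass
class Catalogue:
    """Knows which titles exist and what they cost (prices in pence)."""
    prices: dict[str, int]

    def price_of(self, title: str) -> int:
        return self.prices[title]


@dataclass
class Payments:
    """Takes money from customers and remembers what was charged."""
    charged: list[tuple[str, int]] = field(default_factory=list)

    def charge(self, customer: str, amount: int) -> None:
        self.charged.append((customer, amount))


@dataclass
class Ordering:
    """Places orders. Its collaborators are handed in, never looked up."""
    catalogue: Catalogue
    payments: Payments

    def place_order(self, customer: str, titles: list[str]) -> int:
        total = sum(self.catalogue.price_of(title) for title in titles)
        self.payments.charge(customer, total)
        return total


def build_bookshop(prices: dict[str, int]) -> Ordering:
    """Hypothetical entry point: the one place that states how modules connect."""
    catalogue = Catalogue(prices)
    payments = Payments()
    return Ordering(catalogue=catalogue, payments=payments)
```

Because Ordering receives its collaborators as constructor arguments, every dependency between the modules is visible in build_bookshop; nothing is reached through globals or hidden lookups.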
I also believe that this definition applies recursively. Look inside any of the top-level modules and each should show the same characteristics as the overall application:
- Divided into a small number of submodules.
- Each submodule has a well-defined interface.
- The relationships and lifecycle of each of these submodules are expressed declaratively in the module’s entry point(s).
- All connascence between the submodules is obvious in the code.
These nested modules may be applications, microservices, packages, namespaces, bounded contexts, aggregates, modules, objects, functions — anything that might be considered an “encapsulation unit” or an “object.”
This sits well with what I have called the Page-Jones refactoring algorithm:
- Divide the application into a small number of modules.
- Remove all higher-order connascence between these modules.
- Recurse until bored.
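As one small illustration of the second step, the sketch below weakens connascence of meaning (two modules that must agree on what a magic number means) into connascence of name (both refer to a shared, named concept). The order dictionary, the status values, and the function names are hypothetical, and reading "higher-order" as "the stronger forms in Page-Jones's ordering" is an assumption.

```python
from enum import Enum


# Before: connascence of meaning. Both functions must agree that 2 means "shipped".
def mark_shipped_before(order: dict) -> None:
    order["status"] = 2


def is_shipped_before(order: dict) -> bool:
    return order["status"] == 2


# After: connascence of name. Both functions depend only on a named concept.
class OrderStatus(Enum):
    PLACED = "placed"
    SHIPPED = "shipped"


def mark_shipped(order: dict) -> None:
    order["status"] = OrderStatus.SHIPPED


def is_shipped(order: dict) -> bool:
    return order["status"] is OrderStatus.SHIPPED
```

The coupling has not disappeared, but it is now weaker and visible: a reader can search for OrderStatus.SHIPPED, whereas a search for the literal 2 turns up little of use.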
The decomposition of any module into submodules represents a set of implementation choices. I want to be able to treat each module as a black box and to be able to refactor inside the bounds of that box without friction. This requires that the interface of this module be well-defined (via automated tests, for example) and that the connascence between it and its peers be obvious and understood.
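A minimal sketch of an interface pinned down by tests, assuming a hypothetical shipping rule (orders of 5000 pence or more ship free, all others pay 395 pence). The tests mention only shipping_cost, which stands in for the module's interface; the underscore-prefixed helper is an implementation choice that the tests deliberately ignore.

```python
FREE_SHIPPING_THRESHOLD = 5000  # pence; hypothetical business rule
STANDARD_SHIPPING = 395         # pence; hypothetical business rule


def _qualifies_for_free_shipping(order_total: int) -> bool:
    # Implementation choice: free to be inlined, renamed, or replaced.
    return order_total >= FREE_SHIPPING_THRESHOLD


def shipping_cost(order_total: int) -> int:
    """The module's interface: what shipping costs for a given order total."""
    return 0 if _qualifies_for_free_shipping(order_total) else STANDARD_SHIPPING


def test_large_orders_ship_free() -> None:
    assert shipping_cost(5000) == 0


def test_small_orders_pay_standard_shipping() -> None:
    assert shipping_cost(4999) == 395
```

Both tests keep passing if the helper is inlined or the rule is moved into a lookup table, so the inside of the box can be reshaped freely. A test that called _qualifies_for_free_shipping directly would fail on such a refactoring even though no business behavior had changed.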
It seems to me that the value gained by applying this algorithm decreases as we go down through the layers. That is, there is more value in having obvious relationships between the top-level modules than there is deeper down in the submodule hierarchy. The application will be easier to change if the outermost layer of code is expressed simply and in domain terms. Expressing low-level modules cleanly and simply is still of great benefit, but it contributes less to the overall habitability of the application.
One way to express this might be to say that I am advocating message-oriented programming at all scales. I care about the partitioning into modules only to the extent that the communication between them must be simple and self-evident. The modules exist only in order to hide details of how one layer of messages and state change actually “works.” I guess it is also no coincidence that Michael Feathers’ Naked CRC technique is often a great way to communicate designs that are habitable in this way.
This is all a long-winded way of saying that I think of clean code as habitable code; in turn, habitability is about discoverability, and discoverability is best facilitated by writing the code as one would describe it using the Naked CRC technique.
Unfortunately, most of the codebases I encounter violate the above model in every way. They usually consist of a structureless soup of hundreds of classes, all existing at the same level of abstraction and accessibility. Such codebases are not habitable. It can often be so difficult to find code that it is easier to write a new class than to extend or generalize existing responsibilities. It can often be difficult to refactor because most of the “unit tests” are checking implementation choices rather than business requirements. It is usually difficult to see the overall design because there are no higher-level encapsulation units.