Serpents and Sunbursts in Source Code Structure
Hardware engineers have it so easy.
Or at least they used to. Reviews went something like this.
Gary's designed a new board (OK, this was the late 80s when humans actually designed entire boards). He's beaten his CAD package half to death with its own, unopened manual but all important requirements seem fulfilled. He's even crowbarred a 16-bit A-D converter into the corner. That'll be oiled lightning.
So he calls a meeting with his immediate team and Janice, the lead architect. The following Tuesday morning they gather with cheery coffees. A casual affair. Relaxed. This will be the first of many reviews, after all.
As they take their seats Gary splays a sprawling sheet of paper on the table before them. Mazes of lines connect the resistors, diodes, amps, EPROMs and arrays of other microprocessing paraphernalia in wildly complicated orchestrations.
The team leans in, digesting the vastness. The room's fan whirs. Coffee slurps.
Janice then shifts in her chair, squints a little and says, "Oh." She taps her finger on that A-D converter in the corner. "A little overkill, no?"
A wondrous thing has just happened.
Of course a full appreciation of the design will require deep scrutiny of the microcircuitry logic cascades, component manufacturing tolerances, environmental performance spreadsheets and all that electrical jazz, and experts will hold further, increasingly detailed reviews long before the board sees the business end of a soldering iron. But at that first meeting, with just a few guys'n'gals eyeing a sheet of paper, a minor miracle has taken place.
These people have inferred from mere ink splashes the workings of a pulsing, processing product. They have from abstract pattern evaluated the concrete.
Daily immersion in similar such cognitive shenanigans should not blind us to its astoundingness. It's water-to-wine stuff. But how similar is the software experience?
Do we have any visual patterns - as opposed to GoF patterns - from which we can glean bottom-line, commercial aspects of software? Can a graphical representation of source code structure inform its design?
Let's take a look.
A drive-by download.
Let's download a jar file at random from SourceForge - we'll call it Program X - and plunge into its source code structure. That's its source code structure, mind: its component elements and their inter-relationships. We are not interested in what the program does or how well it does it. In other words, we are not interested in its user-experience realisation.
Figure 1 shows a function-level spoiklin diagram of a typical class in Program X. (We'll use the terms, "Function," and, "Method," interchangeably.)
Figure 1: Class Criteria.
Figure 1 shows us the methods of Program X's Criteria class (a method being represented by a circle). We see many methods with no lines: these methods lack dependencies on other methods within the class. As many of these methods seem to be getters, this perhaps does not surprise.
The put() methods all have dependencies on other methods in the class. The remove() method and Criteria() constructor also sport some dependencies.
The question is: does this class have a good method structure?
To answer this question we need a principle which is defined in terms represented by the diagram and we need a means of evaluating whether those representations adhere to that principle. For this review, we'll choose just one of the Tulegatan principles, the principle of impact set. Yes, distortions abound when viewing any program through the lens of a single principle, but time allows us only to hint at possibility - that we can read business-relevant information from abstraction - rather than convince with overwhelming evidence.
The principle of impact set states that the more long, transitive dependencies you have on an element, the higher the probability that a change to that element will ripple to others, with circular dependencies counting as the ultimate in long, transitive dependencies.
So, given that this principle is written in terms of the elements shown by figure 1 - in this case, methods - and that we have a means of evaluating adherence to the principle - by asking, "How many long, transitive or circular dependencies are there?" - we seem able to evaluate this source code structure.
It would appear that, based on the principle of impact set alone, figure 1 is well-structured. (Either that or it offers too little structure to be judged; we certainly cannot say that it is badly-structured.) A change to any method will probably not cause a large change to any other. Changes to this class should be cheap. No, we cannot say this with certainty, but from the vantage point of our single principle our money's looking safe.
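For the curious, the question asked of figure 1 can also be put to a machine. What follows is a minimal sketch under stated assumptions, not the tooling behind the spoiklin diagrams: it assumes the method dependencies are already available as a simple map, and the class name ImpactSetSketch and the methods a to d are invented purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the graph below is invented, not taken from Program X.
class ImpactSetSketch {

    // Returns every element that depends, directly or transitively, on target.
    // dependsOn maps an element to the elements it uses; elements that use
    // nothing may simply be left out of the map.
    static Set<String> impactSet(Map<String, List<String>> dependsOn, String target) {
        // Invert the edges: for each element, record who depends on it.
        Map<String, List<String>> dependents = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : dependsOn.entrySet()) {
            for (String used : entry.getValue()) {
                dependents.computeIfAbsent(used, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Breadth-first walk backwards along the dependencies.
        Set<String> impacted = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(target);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            for (String caller : dependents.getOrDefault(current, List.of())) {
                // The add() check stops the walk looping forever on a cycle.
                if (!caller.equals(target) && impacted.add(caller)) {
                    queue.add(caller);
                }
            }
        }
        return impacted;
    }

    public static void main(String[] args) {
        // a uses b, b uses c; d stands alone, like a getter with no lines.
        Map<String, List<String>> dependsOn = Map.of(
                "a", List.of("b"),
                "b", List.of("c"));
        System.out.println(impactSet(dependsOn, "c")); // prints [b, a]
        System.out.println(impactSet(dependsOn, "d")); // prints []
    }
}
```

A method with no incoming lines returns an empty set: changing it should ripple nowhere within the class, which is the reading given to figure 1 above.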
Let's plough on. Figure 2 shows class Index.
Figure 2: Class Index.
Figure 2 seems, again from an impact set point of view, similar to figure 1. There are no long, transitive dependencies within class Index to act as bearers for ripple-effect changes. So, again, the structure can't be too bad.
Figure 3: Class Store.
There's something primordial, isn't there? Figure 3 wakes an ancient revulsion in programmers. Less scrupulous principles might, on this first glimpse alone, drag the Store class out back for a severe slappin'. The principle of impact set, however, stays cool, having only one criterion on its mind: are there long transitive or circular dependencies?
We can best answer this by viewing the circular dependencies flowing through just that central method, getStoreInfo(), showing methods involved in no circular dependencies as black and those contributing to circular dependencies as highlighted in blue; see figure 4.
Figure 4: Circular dependencies through getStoreInfo()
Half of all the class's methods hang like twitching flies on a sticky web of circular dependencies. It is far from obvious how changing any of these methods might affect any other: the very characteristic of engorged impact set.
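The blue highlighting of figure 4 rests on a simple test: a method sits on a cycle through getStoreInfo() if getStoreInfo() can reach it and it can reach getStoreInfo() back. Here is a hedged sketch of that test. Only getStoreInfo comes from Program X; the class name CycleSketch, the other method names and every dependency shown are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: apart from getStoreInfo, all names are made up.
class CycleSketch {

    // Everything reachable from start by following one or more edges.
    static Set<String> reachable(Map<String, List<String>> edges, String start) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(start);
        while (!stack.isEmpty()) {
            for (String next : edges.getOrDefault(stack.pop(), List.of())) {
                if (seen.add(next)) {
                    stack.push(next);
                }
            }
        }
        return seen;
    }

    // The same graph with every edge turned around.
    static Map<String, List<String>> reverse(Map<String, List<String>> edges) {
        Map<String, List<String>> reversed = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : edges.entrySet()) {
            for (String to : entry.getValue()) {
                reversed.computeIfAbsent(to, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return reversed;
    }

    // Methods lying on some cycle through the given method: those it can
    // reach that can also reach it. Empty if the method is on no cycle.
    static Set<String> cycleMembers(Map<String, List<String>> dependsOn, String method) {
        Set<String> members = new TreeSet<>(reachable(dependsOn, method));
        members.retainAll(reachable(reverse(dependsOn), method));
        return members;
    }

    static Map<String, List<String>> sample() {
        return Map.of(
                "getStoreInfo", List.of("load", "getName"),
                "load", List.of("refresh"),
                "refresh", List.of("getStoreInfo"));
    }

    public static void main(String[] args) {
        // prints [getStoreInfo, load, refresh]
        System.out.println(cycleMembers(sample(), "getStoreInfo"));
        // prints []
        System.out.println(cycleMembers(sample(), "getName"));
    }
}
```

In this toy graph getName would stay black and the other three methods would turn blue.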
Nor are even the non-circular dependencies short, which might have generated some sympathy.
If we tease apart the strands we find several long dependencies snaking through the undergrowth, such as that shown in figure 5, a chain that runs through getParameters(), etc. Here is the infamous, "Serpent," pattern with its long chain of methods defined in terms of one another. Not as garish as the circular dependency, the serpent smothers subtly, increasing impact set without raising alarm until that dreaded Friday afternoon when a small change suddenly radiates out into dozens of others.
Figure 5: A serpent in the Store.
Perfectly good design decisions may lie behind the structures of figures 4 and 5. But for this review they do not matter. We must conclude that the method structure of this class is poor. Design decisions can justify poor structure but cannot hide it. Poor structure is potential change cost no matter how many reviews it survives.
Still, one class does not an application make. So let's go up a level and look at some Program X class-level spoiklin diagrams.
Figure 6: Package servlet.
Most of Program X's packages hold few classes. Figure 6, for example, shows the servlet package of 7 classes. Well-structured, this package presents uncomplicated dependencies rendering almost trivial the task of tracking potential ripple-effect changes. Excellent.
Indeed this package offers a rudimentary example of the, "Sunburst," pattern whereby a single class - in this case, RequestHelper - uses the functionality of several subordinates while those subordinates themselves remain relatively independent.
From the point of view of impact set, this is the virtuous opposite of the serpent in that the sunburst minimises the potential cost of ripple-effect changes among a group of classes. Try it for yourself. Take a random element from the serpent and sunburst above and count how many elements transitively depend on it.
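If counting circles by hand does not appeal, the same experiment can be run in code. The sketch below is an idealisation, not a model of Program X: it builds a pure serpent and a pure sunburst of seven elements each (seven only because the servlet package happens to hold seven classes), numbers the elements rather than naming them, and counts transitive dependents. The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: idealised shapes, elements identified by index.
class SerpentVsSunburst {

    // Serpent: element i depends on element i + 1, one long chain.
    static List<List<Integer>> serpent(int n) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> uses = new ArrayList<>();
            if (i + 1 < n) {
                uses.add(i + 1);
            }
            graph.add(uses);
        }
        return graph;
    }

    // Sunburst: element 0 depends on all the others, which use nothing.
    static List<List<Integer>> sunburst(int n) {
        List<List<Integer>> graph = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            graph.add(new ArrayList<>());
        }
        for (int i = 1; i < n; i++) {
            graph.get(0).add(i);
        }
        return graph;
    }

    // Marks everything node depends on, directly or transitively.
    static void collect(List<List<Integer>> graph, int node, BitSet reached) {
        for (int next : graph.get(node)) {
            if (!reached.get(next)) {
                reached.set(next);
                collect(graph, next, reached);
            }
        }
    }

    // How many other elements transitively depend on target.
    static int dependentsOf(List<List<Integer>> graph, int target) {
        int count = 0;
        for (int node = 0; node < graph.size(); node++) {
            BitSet reached = new BitSet(graph.size());
            collect(graph, node, reached);
            if (node != target && reached.get(target)) {
                count++;
            }
        }
        return count;
    }

    // Sum of dependentsOf over every element: a crude ripple-risk score.
    static int totalImpact(List<List<Integer>> graph) {
        int total = 0;
        for (int target = 0; target < graph.size(); target++) {
            total += dependentsOf(graph, target);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(dependentsOf(serpent(7), 6));  // prints 6
        System.out.println(dependentsOf(sunburst(7), 6)); // prints 1
        System.out.println(totalImpact(serpent(7)));      // prints 21
        System.out.println(totalImpact(sunburst(7)));     // prints 6
    }
}
```

Same number of elements, same number of dependencies, yet the serpent's tail drags six dependents behind it while no ray of the sunburst carries more than one.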
Let's look at another package, package data.
Figure 7: Package data.
Figure 7 shows another well-sized package though with a little more meat on its bones than the servlet package. The heavy lifter of figure 7 seems to be the AbstractDatabase class which again enjoys the restricted impact set afforded by its sunburst pattern. So, again, we happily judge this a well-structured set of classes.
We could, of course, go on, eventually cramponing our way up to the package-level view showing the entire application in a single spoiklin diagram, but the point has perhaps been made.
Can we infer business-relevant information from abstract structure patterns? Hell, yes.
Program-comprehension is not text-comprehension.
The structure of a program, sanitized of run-time context and rendered visually, can provide business-relevant information difficult or impossible to glean by other means.
Those hardware engineers of our anecdote were on to something.
And that 16-bit A-D converter? Didn't make it. Gary was devastated.
Software design grinds to the crunching of competing principles. This review has focused on just one, that of impact set, yet even that single principle yielded examples of patterns to be mimicked - the glorious sunburst pattern - and patterns to be shunned - the tongue-flicking serpent.
Source code structure is not about academic gymnastics or programmer hubris. It's about potential cost of software development. Good code structure is as optional as profit.
There is gold in the structural should you only choose to mine.