The Golden Age of Spaghetti Code

According to Edmund Kirwan, the greatest trick spaghetti code ever pulled was convincing the world that it didn't exist.

Which of the following two Java package structures is the less well-designed?

Figure 1: Spaghetti structure (sources here and here).

Oops! Let's try that again.

Figure 2: Two Java package structures, JUnit and Spoiklin Soice.

This blog's banged on and on about how much better the package structure on the right in Figure 2 is than that on the left (circles are packages, straight lines represent down-the-page dependencies, and curved lines are up-the-page). The right's structure presents clearer dependencies, making update costs easier to predict and the updates themselves often easier to implement.

Two problems, however, persist.

Firstly, graphical evaluation is subjective. Yes, most would agree that the structure on the right is better, but consider Figure 3.

Figure 3: Two more messy Java package structures, Lucene and Struts.

Which do you think is the messier structure in Figure 3? (The answer is in Table 2.)

Secondly, diagrams such as these offer insight when we evaluate a small number of nodes, as on package-level, but fail before the ghastly node-pocalypse of class- and method-level.

Figure 4: A Java method-level structure. Good luck with that.

What we need is to objectively quantify spaghettiness: its messiness, its disorder. How on earth do we do that? What makes a structure messy?

Fortunately, mathematics has already defined what spaghettiness is by defining its opposite: total order. And we can apply this to computer programs, with just one teensy supposition.

Mathematics says that if a set has a binary relation with just three specific properties, then that set enjoys total ordering. Let's go through it.

Consider the three methods in Figure 5, where method a() calls b() and b() calls c(), forming the single transitive dependency: a → b → c.

Figure 5: Three simple methods.

From this diagram, we must extract a set of numbers. We'll extract our old friend depth, where depth is a method's position in the transitive dependency. Thus, a() is at position 0 (because programmers), b() is at position 1, and c() is at 2.

Figure 6: Three simple methods numbered by their depth in a transitive dependency.
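Numbering methods by depth can be sketched in a few lines of Java. This is only a sketch under an assumption the text does not spell out: depth is taken here as the shortest call distance from any entry point, meaning a method that nothing else calls. The class name DepthSketch and the map-of-lists call graph are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DepthSketch {
    /** Assumed definition: depth = shortest call distance from any method that nothing calls. */
    static Map<String, Integer> depths(Map<String, List<String>> calls) {
        // Every method that appears as a callee has at least one caller.
        Set<String> called = new HashSet<>();
        for (List<String> callees : calls.values()) {
            called.addAll(callees);
        }
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String method : calls.keySet()) {
            if (!called.contains(method)) { // entry point: depth 0
                depth.put(method, 0);
                queue.add(method);
            }
        }
        // Breadth-first search: the first visit to a method is via a shortest path.
        while (!queue.isEmpty()) {
            String caller = queue.remove();
            for (String callee : calls.getOrDefault(caller, List.of())) {
                if (!depth.containsKey(callee)) {
                    depth.put(callee, depth.get(caller) + 1);
                    queue.add(callee);
                }
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Figure 5: a() calls b(), and b() calls c().
        Map<String, List<String>> calls = new LinkedHashMap<>();
        calls.put("a", List.of("b"));
        calls.put("b", List.of("c"));
        calls.put("c", List.of());
        System.out.println(depths(calls)); // prints {a=0, b=1, c=2}
    }
}
```

For a single chain like this one, the assumed definition and "position in the transitive dependency" coincide; they only part ways once several chains share methods.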

Mathematics tells us that this "program" is totally ordered with respect to depth, if, when you extract these depth values and iterate over them in pairs – a pair of depth values being, say, d1 and d2 – then the following properties hold:

  • If d1 <= d2 and d2 <= d1, then d1 == d2. (Duh.)
  • If d1 <= d2 and d2 <= d3, then d1 <= d3. We'll come back to this.
  • It is always true that either d1 <= d2 or d2 <= d1.

The first and third properties are rather trivial, but that second property says that if we write out our transitive dependencies then depth values should never decrease. And in Figure 6, they do not: a(0) → b(1) → c(2).

As our program's depths satisfy all these properties, it is totally ordered. We have achieved mathematical objectivity.
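In practice the check boils down to the "never decrease" reading: walk the depth values of one transitive dependency and make sure no step goes backwards. A minimal sketch, with the hypothetical names OrderCheck and isOrdered:

```java
final class OrderCheck {
    /** True if depth values never decrease along one transitive dependency. */
    static boolean isOrdered(int... depths) {
        for (int i = 1; i < depths.length; i++) {
            if (depths[i - 1] > depths[i]) {
                return false; // this step goes back up the page
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Figure 6: a(0), b(1), c(2)
        System.out.println(isOrdered(0, 1, 2)); // prints true
    }
}
```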

Figure 7 shows a slightly more complicated program of two transitive dependencies, again with methods' depths indicated.

Figure 7: Oooooh! Two transitive dependencies.

Both transitive dependencies separately satisfy the three properties required.

Now let's look at a bad boy. Suppose someone grabs this code and calls e() from c(), creating a dependency from c() back up to e().

Figure 8: Our first messiness.

Recall that curved lines represent dependencies that go up the page. With c() now depending on e(), we have the transitive dependency a(0) → b(1) → c(2) → e(1) → f(2), in which the depth value decreases at one node: from c(2) to e(1). This transitive dependency is therefore not totally ordered, so we cry carbohydrate!

Thus we can now define our metric. No, not "spaghettiness." Let us channel our inner squares and call it "structural disorder." A transitive dependency is structurally disordered if it does not satisfy the total order properties above, and a program's overall structural disorder is then the percentage of disordered transitive dependencies.
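Here is one way the metric might be computed, offered as a sketch rather than as the tooling behind the figures in this post. It rests on two assumptions: depth is the shortest call distance from any method that nothing calls, and a "transitive dependency" is a path from such an entry point down to a method where the path cannot be extended. The class StructuralDisorder and all its method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class StructuralDisorder {
    /** Methods that nothing calls: the assumed starting points of transitive dependencies. */
    static List<String> entryPoints(Map<String, List<String>> calls) {
        Set<String> called = new HashSet<>();
        calls.values().forEach(called::addAll);
        List<String> roots = new ArrayList<>();
        for (String method : calls.keySet()) {
            if (!called.contains(method)) roots.add(method);
        }
        return roots;
    }

    /** Assumed depth: shortest call distance from any entry point (breadth-first search). */
    static Map<String, Integer> depths(Map<String, List<String>> calls) {
        Map<String, Integer> depth = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String root : entryPoints(calls)) {
            depth.put(root, 0);
            queue.add(root);
        }
        while (!queue.isEmpty()) {
            String caller = queue.remove();
            for (String callee : calls.getOrDefault(caller, List.of())) {
                if (depth.putIfAbsent(callee, depth.get(caller) + 1) == null) {
                    queue.add(callee);
                }
            }
        }
        return depth;
    }

    /** Collects every path that starts at path's first method and cannot be extended further. */
    static void collect(Map<String, List<String>> calls, Deque<String> path, List<List<String>> out) {
        boolean extended = false;
        for (String callee : calls.getOrDefault(path.peekLast(), List.of())) {
            if (path.contains(callee)) continue; // cycle guard: never revisit a method on this path
            path.addLast(callee);
            collect(calls, path, out);
            path.removeLast();
            extended = true;
        }
        if (!extended) out.add(new ArrayList<>(path));
    }

    /** Percentage of transitive dependencies in which depth decreases at some step. */
    static double disorderPercent(Map<String, List<String>> calls) {
        Map<String, Integer> depth = depths(calls);
        List<List<String>> paths = new ArrayList<>();
        for (String root : entryPoints(calls)) {
            Deque<String> path = new ArrayDeque<>();
            path.add(root);
            collect(calls, path, paths);
        }
        int disordered = 0;
        for (List<String> p : paths) {
            for (int i = 1; i < p.size(); i++) {
                if (depth.get(p.get(i - 1)) > depth.get(p.get(i))) {
                    disordered++;
                    break; // one backward step is enough to disorder this path
                }
            }
        }
        return paths.isEmpty() ? 0.0 : 100.0 * disordered / paths.size();
    }

    public static void main(String[] args) {
        // Figure 8: a -> b -> c and d -> e -> f, plus the new call from c() to e().
        Map<String, List<String>> calls = new LinkedHashMap<>();
        calls.put("a", List.of("b"));
        calls.put("b", List.of("c"));
        calls.put("c", List.of("e"));
        calls.put("d", List.of("e"));
        calls.put("e", List.of("f"));
        System.out.println(disorderPercent(calls)); // prints 50.0
    }
}
```

Under these assumptions Figure 8 has two transitive dependencies, a-b-c-e-f and d-e-f, and only the first steps backwards, hence 50%. Enumerating every path is exponential in the worst case, so a sketch like this suits toy graphs, not Lucene.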

Let's take this puppy out for a spin.

Looking at the two package structures in Figure 2 once again, we would intuitively expect the structure on the right to be far less disordered than that on the left, and it turns out to be so:

Figure 9: JUnit disorder is 76%, but Spoiklin Soice disorder is 3%.

Although we seek an objective measure, we nevertheless expect that as structures become subjectively messier-looking, their structural disorder values should rise. We can test this by taking two perfectly structured systems, "refactoring" them by applying random dependencies between nodes and checking whether their disorder values generally rise as their structures collapse. See Figure 10.

Figure 10: Two sad, decaying systems.

We can even simplify matters by defining the (admittedly arbitrary) categorization whereby a program suffering from 50% disorder or more is spaghetti. The threshold might have been 40% or 60% - feel free to choose your own. In fact, we'll have four categories, distinguished by garish, child-friendly color coding: red and black=naughty, green and white=nice.

Disorder Evaluation
0-24% Good
25-49% Fair
50-74% Spaghetti
75-100%


Table 1: The four categories of structural disorder.
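The banding is easy to express in code. A small sketch follows; the enum and method names are invented for illustration, and TOP_BAND in particular is a placeholder for the 75-100% category, not a label taken from the table.

```java
final class DisorderCategory {
    /** Four bands; TOP_BAND is a placeholder name for 75-100%. */
    enum Band { GOOD, FAIR, SPAGHETTI, TOP_BAND }

    static Band bandOf(int percent) {
        if (percent < 0 || percent > 100) {
            throw new IllegalArgumentException("not a percentage: " + percent);
        }
        if (percent < 25) return Band.GOOD;
        if (percent < 50) return Band.FAIR;
        if (percent < 75) return Band.SPAGHETTI;
        return Band.TOP_BAND;
    }

    /** The (admittedly arbitrary) rule: 50% disorder or more is spaghetti. */
    static boolean isSpaghetti(int percent) {
        return bandOf(percent).compareTo(Band.SPAGHETTI) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(bandOf(3));       // Spoiklin Soice packages: prints GOOD
        System.out.println(isSpaghetti(76)); // JUnit packages: prints true
    }
}
```

Moving the threshold to 40% or 60% means changing the comparisons in bandOf, nothing else.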

Let's point our disorder-binoculars at 15 Java programs, some quite well-known. Table 2 shows the programs and their structural disorder percentages on method-, class- and package-level. You'd expect most professionally designed programs to be "good" to "fair" on the disorder spectrum, so the table should appear overwhelmingly green and white, yes?

Program Method Class Package
Cassandra 41 82 84
Zookeeper 28 85 93
ActiveMQ Broker 24 80 89
Jenkins 26 72 90
JUnit 34 78 76
Camel 22 90 70
Lucene 33 70 73
FitNesse 33 55 61
Tomcat (Coyote) 22 81 40
Maven 30 30 74
Log4j 25 59 47
Struts 11 42 74
Spring 27 60 35
Netty 22 69 20
Spoiklin Soice 26 25 3
Average 27 65 62


Table 2: The structural disorder percentages of 15 Java programs.
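To double-check the averages and count how many programs cross the 50% spaghetti line at each level, the table can be fed through a few lines of Java. The numbers are copied from Table 2; the class and method names are made up for this sketch.

```java
final class TableTwo {
    static final String[] LEVELS = {"Method", "Class", "Package"};

    // One row per program, in Table 2 order: {method, class, package} disorder in percent.
    static final int[][] DISORDER = {
        {41, 82, 84}, // Cassandra
        {28, 85, 93}, // Zookeeper
        {24, 80, 89}, // ActiveMQ Broker
        {26, 72, 90}, // Jenkins
        {34, 78, 76}, // JUnit
        {22, 90, 70}, // Camel
        {33, 70, 73}, // Lucene
        {33, 55, 61}, // FitNesse
        {22, 81, 40}, // Tomcat (Coyote)
        {30, 30, 74}, // Maven
        {25, 59, 47}, // Log4j
        {11, 42, 74}, // Struts
        {27, 60, 35}, // Spring
        {22, 69, 20}, // Netty
        {26, 25, 3},  // Spoiklin Soice
    };

    /** Mean disorder at one level, rounded to the nearest whole percent. */
    static long average(int level) {
        int sum = 0;
        for (int[] row : DISORDER) sum += row[level];
        return Math.round(sum / (double) DISORDER.length);
    }

    /** Number of programs at 50% disorder or more at one level. */
    static int spaghettiCount(int level) {
        int count = 0;
        for (int[] row : DISORDER) {
            if (row[level] >= 50) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int level = 0; level < LEVELS.length; level++) {
            System.out.printf("%s: average %d%%, %d of %d at 50%% or more%n",
                    LEVELS[level], average(level), spaghettiCount(level), DISORDER.length);
        }
        // Method: average 27%, 0 of 15 at 50% or more
        // Class: average 65%, 12 of 15 at 50% or more
        // Package: average 62%, 10 of 15 at 50% or more
    }
}
```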

Oh.

It seems that we, as professional programmers, can write more or less well-structured methods, but above that...RRRR MRRRR GRRRRD.

Three points are noteworthy.

First, we chest-bump endlessly about refactoring, yet refactoring definitionally involves just one thing: improving software structure. Table 2 suggests that we fail to consider refactoring at class- and package-level.

Second, higher-level structure can provide a model, a simplified view, of the lower levels: a good package structure, for example, can offer a great map of functionality without pushing the programmer's nose into foul code. Yet our higher-level models seem vastly more disordered than that which they model. Table 2 suggests that we fail to maximize the benefits of higher-level structure.

Third, Oracle will release Java 9 any day now (honest!) with its new modules, offering a level of structure above even package-level. Yet we apparently lack the desire or competence to manage the levels we already have. Table 2 predicts the rise of spaghetti modules.

Figure 11: Not another code review... (source here).

So, are we still writing spaghetti code?

Hell, yes! Not only are we still writing spaghetti code, we're living in the golden age of spaghetti code, an age in which we professional programmers don't just observe and casually ignore spaghetti, we don't even recognize it in the first place.

The GOTO statement used to be the alarm that forced programmers to manage control flow in their programs. Abandoning the GOTO statement, however, in no way removes this concern but rather migrates control flow to the realm of inter-method, inter-class and inter-package dependency where (in those last two cases, at least) its complexity now thrives, far from the programmer's gaze.

The greatest trick spaghetti code ever pulled was convincing the world that it didn't exist.

Summary

This is my structure. There are many like it, but this one is mine.

My structure is my best friend. It is my life. I must master it as I must master my life.

Without me, my structure is useless. Without my structure, I am useless. I must design my structure true. I must design cleaner than my enemy who is trying to out-structure me. I must embarrass him before he embarrasses me. I will...

My structure and I know that what counts in programming is not the variables we rename, the methods we extract, nor the conditionals we replace with polymorphism. We know that it is reduced disorder that counts. We will reduce disorder...
