Okay, this is the separate blog post that I referred to in Software architecture vs code. What exactly do we mean by an "architecturally-evident coding style"? I built a simple content aggregator for the local tech community here in Jersey called techtribes.je, which is basically made up of a web server, a couple of databases and a standalone Java application that is responsible for actually aggegrating the content displayed on the website. You can read a little more about the software architecture at techtribes.je - containers. The following diagram is a zoom-in of the standalone content updater application, showing how it's been decomposed.
This diagram says that the content updater application is made up of a number of core components (which are shown on a separate diagram for brevity) and an additional four components - a scheduled content updater, a Twitter connector, a GitHub connector and a news feed connector. This diagram shows a really nice, simple architecture view of how my standalone content updater application has been decomposed into a small number of components. "Component" is a hugely overloaded term in the software development industry, but essentially all I'm referring to is a collection of related behaviour sitting behind a nice clean interface.
Back to the "architecturally-evident coding style" and the basic premise is that the code should reflect the architecture. In other words, if I look at the code, I should be able to clearly identify each of the components that I've shown on the diagram. Since the code for techtribes.je is open source and on GitHub, you can go and take a look for yourself as to whether this is the case. And it is ... there's a je.techtribes.component package that contains sub-packages for each of the components shown on the diagram. From a technical perspective, each of these are simply Spring Beans with a public interface and a package-protected implementation. That's it; the code reflects the architecture as illustrated on the diagram.
So what about those core components then? Well, here's a diagram showing those.
Again, this diagram shows a nice simple decomposition of the core of my techtribes.je system into coarse-grained components. And again, browsing the source code will reveal the same one-to-one mapping between boxes on the diagram and packages in the code. This requires conscious effort to do but I like the simple and explicit nature of the relationship between the architecture and the code.
When architecture and code don't match
The interesting part of this story is that while I'd always viewed my system as a collection of "components", the code didn't actually look like that. To take an example, there's a tweet component on the core components diagram, which basically provides CRUD access to tweets in a MongoDB database. The diagram suggests that it's a single black box component, but my initial implementation was very different. The following diagram illustrates why.
My initial implementation of the tweet component looked like the picture on the left - I'd taken a "package by layer" approach and broken my tweet component down into a separate service and data access object. This is your stereotypical layered architecture that many (most?) books and tutorials present as a way to build (e.g.) web applications. It's also pretty much how I've built most software in the past too and I'm sure you've seen the same, especially in systems that use a dependency injection framework where we create a bunch of things in layers and wire them all together. Layered architectures have a number of benefits but they aren't a silver bullet.
This is a great example of where the code doesn't quite reflect the architecture - the tweet component is a single box on an architecture diagram but implemented as a collection of classes across a layered architecture when you look at the code. Imagine having a large, complex codebase where the architecture diagrams tell a different story from the code. The easy way to fix this is to simply redraw the core components diagram to show that it's really a layered architecture made up of services collaborating with data access objects. The result is a much more complex diagram but it also feels like that diagram is starting to show too much detail.
The other option is to change the code to match my architectural vision. And that's what I did. I reorganised the code to be packaged by component rather than packaged by layer. In essence, I merged the services and data access objects together into a single package so that I was left with a public interface and a package protected implementation. Here's the tweet component on GitHub.
But what about...
Again, there's a clean simple mapping from the diagram into the code and the code cleanly reflects the architecture. It does raise a number of interesting questions though.
- Why aren't you using a layered architecture?
- Where did the TweetDao interface go?
- How do you mock out your DAO implementation to do unit testing?
- What happens if I want to call the DAO directly?
- What happens if you want to change the way that you store tweets?
Layers are now an implementation detail
This is still a layered architecture, it's just that the layers are now a component implementation detail rather than being first-class architectural building blocks. And that's nice, because I can think about my components as being my architecturally significant structural elements and it's these building blocks that are defined in my dependency injection framework. Something I often see in layered architectures is code bypassing a services layer to directly access a DAO or repository. These sort of shortcuts are exactly why layered architectures often become corrupted and turn into big balls of mud. In my codebase, if any consumer wants access to tweets, they are forced to use the tweet component in its entirety because the DAO is an internal implementation detail. And because I have layers inside my component, I can still switch out my tweet data storage from MongoDB to something else. That change is still isolated.
Component testing vs unit testing
Ah, unit testing. Bundling up my tweet service and DAO into a single component makes the resulting tweet component harder to unit test because everything is package protected. Sure, it's not impossible to provide a mock implementation of the MongoDBTweetDao but I need to jump through some hoops. The other approach is to simply not do unit testing and instead test my tweet component through its public interface. DHH recently published a blog post called Test-induced design damage and I agree with the overall message; perhaps we are breaking up our systems unnecessarily just in order to unit test them. There's very little to be gained from unit testing the various sub-parts of my tweet component in isolation, so in this case I've opted to do automated component testing instead where I test the component as a black-box through its component interface. MongoDB is lightweight and fast, with the resulting component tests running acceptably quick for me, even on my ageing MacBook Air. I'm not saying that you should never unit test code in isolation, and indeed there are some situations where component testing isn't feasible. For example, if you're using asynchronous and/or third party services, you probably do want to ability to provide a mock implementation for unit testing. The point is that we shouldn't blindly create designs where everything can be mocked out and unit tested in isolation.
Food for thought
The purpose of this blog post was to provide some more detail around how to ensure that code reflects architecture and to illustrate an approach to do this. I like the structure imposed by forcing my codebase to reflect the architecture. It requires some discipline and thinking about how to neatly carve-up the responsibilities across the codebase, but I think the effort is rewarded. It's also a nice stepping stone towards micro-services. My techtribes.je system is constructed from a number of in-process components that I treat as my architectural building blocks. The thinking behind creating a micro-services architecture is essentially the same, albeit the components (services) are running out-of-process. This isn't a silver bullet by any means, but I hope it's provided some food for thought around designing software and structuring a codebase with an architecturally-evident coding style.