So it’s been four years since microservices architecture first captured our attention. It was promised as an architectural style that lets us compose small, unique services around business domain problems to build complex and wonderful solutions. A network of interconnected services choreographed beautifully to give our users an uninterrupted experience.
0:30 Collaborate via Choreography
This was a grassroots movement, and our developer community felt quite passionate about it. We embraced its complexity and built a healthy ecosystem of tools and techniques to let us automate delivery pipelines in a federated manner and release code from commit to production, for the first time, in a truly independent way. So, for the last few years, my coworkers and I at ThoughtWorks have been building microservices architectures.
Today I’m going to share some of the real-world challenges that you will face, especially in large organizations, and hopefully give you a few tactics to maneuver around those obstacles.
2:30 Battling the Monoliths
The first one is an organizational problem; it’s how the teams are formed, how the budgeting is allocated, and how the money flows in large corporations.
The second one is the battle of monoliths – battling legacy systems and migrating into an architecture oriented to small services.
Once you’ve gone through these two obstacles, you have to curb the enthusiasm of your new starters, who get excited about modeling services. Modeling services is hard, and the common trap is moving too fast towards too fine‑grained services without the operational readiness.
If you think that the main challenge you have for building microservices architecture is a technical problem – coding, design, and DevOps practices – think again. It’s an organizational problem. This is a recurring theme for many of our clients; different parts of the organization get together to fight for a budget for a list of projects they have for the year. It’s called annual portfolio planning.
Once they get budget for their projects, they start forming ephemeral teams so each can go off and implement this single project. When the project’s delivered, the team dissipates and they move onto the next project, and the cycle continues.
So what’s wrong with this picture? Well, if you want to rearchitect your system towards microservices and you have to go through an annual budgeting, it’s very difficult to build a business case to let you migrate architecture. It’s almost impossible without showing that you can deliver business value with that rearchitecture to your consumers.
Once you’ve got that business case in place, convinced your management, and got money for the rearchitecture, the problem is that you don’t have long‑standing teams to own the services in an autonomous way.
So how are we going to go around this? We first start with creating a business incentive. In many organizations, it’s the scale of operation; it’s like Netflix going from renting videos to streaming movies online to millions of people. In some organizations, it’s the speed of delivery. You want twice the speed of delivery that you have today. Maybe you want to expand your business into new domains and be able to compose solutions to pivot your business.
So whatever it is, be very clear about it, and bring your product and business people along on this journey. Articulate and align your business initiatives with your architectural>
6:09 Building a Model for Migration
Start with a small hypothesis as to how this architecture is gonna get you going faster, and start building a proof of concept for that.
From there you go through expansion. You expand that model to other projects and other teams and start continuously evolving from a project‑based mindset to a product‑ and service‑based mindset – from short‑term ephemeral teams to long‑standing teams. And the cycle continues. It’s an evolutionary, iterative way to wear out the grooves that old organizational thinking and old structures have put in place.
Once you go through your organizational problem, the next one up, one of the hardest, is how to deconstruct the old legacy systems. It’s a question of: how am I gonna decompose a system that is continuously under change, and how am I gonna do this in a financially viable way?
To give you an example, my current client tried to pull a very simple CRUD‑based service out of a legacy system with 2.5 million lines of code that was built over the last ten years, and it cost as many dollars as there are lines of code in that legacy system; that’s not financially viable.
The migration becomes a continuous trade‑off between: shall I extract code and reuse, or shall I retire code and rewrite new code? You have to do this continuous cost‑benefit analysis as you go through experimentation with either model.
8:20 Find the Seams
So I’ll give you a few heuristics to go about this deconstruction. The first one is: find the seams that might exist around the business capabilities in your existing system. The picture you see above is an analysis I did of the package dependencies in a .NET codebase, to see whether I could find these seams and boundaries of business capabilities to pull out.
Unfortunately, in this one, I really couldn’t see any clear boundaries. Use a structural analysis tool or runtime analysis tool to find the seams.
8:51 Measure Toxicity
Don’t forget the toxicity and liveliness of your code. What this diagram shows, what we were trying to find out, is which parts of the code are toxic and which parts are healthy, alive, and being used. The black circles are toxic code that is in use.
11:24 Deconstructing the Monolith
So, we talked about finding seams around bounded contexts. The next heuristic is to use the theory of constraints. You want to go fast, you want to scale parallel development; what is it in your system that is the most limiting factor? Find those hot spots and remove them.
To give you an example, I’m deconstructing a big monolith right now. It’s a web‑based commerce system and one point of contention is session management: logging users in, identifying users, and associating attributes with a user. That’s coupling all the other user‑centric services to this monolith. So we tried to pull that out first.
Apply the strangler pattern. There is a healthy dose of documentation around this, but the idea is that you start building new capabilities, or replicating existing capabilities, by writing new code around your legacy system. Put an NGINX server in front of it and reroute traffic to the new capabilities instead of the old ones, until the old system can be retired.
This is another technique that I’ve started experimenting with – I call it monolith in a box. The idea is that sometimes we don’t need to change the monolith, but we need to run it and execute it, and test the systems we’ve built around it. With these kinds of old systems, testing, execution, and deployment are really difficult. You have configurations scattered all over your infrastructure, business logic in your five load balancers.
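To make the routing idea concrete, here is a minimal sketch in Python of the decision such a front proxy makes. It is not an NGINX configuration, and the class name, upstream URLs, and path prefixes are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StranglerRouter:
    """Decides, per request path, whether the monolith or a new service answers."""

    legacy_upstream: str
    migrated: dict = field(default_factory=dict)  # path prefix -> new upstream

    def migrate(self, prefix: str, upstream: str) -> None:
        """Record that everything under `prefix` is now served by `upstream`."""
        self.migrated[prefix.rstrip("/")] = upstream

    def route(self, path: str) -> str:
        """Return the upstream for `path`, preferring the longest migrated prefix."""
        matches = [
            p for p in self.migrated
            if path == p or path.startswith(p + "/")
        ]
        if matches:
            return self.migrated[max(matches, key=len)]
        return self.legacy_upstream

    def legacy_paths(self, known_paths):
        """Paths that still land on the monolith; an empty list means it can be retired."""
        return [p for p in known_paths if self.route(p) == self.legacy_upstream]


# Hypothetical upstreams: session management is pulled out first.
router = StranglerRouter(legacy_upstream="http://monolith.internal")
router.migrate("/session", "http://session-service.internal")
```

Each extracted capability adds one more prefix, and `legacy_paths` shrinks over time; when it is empty for every path you know about, nothing is routed to the old system any more.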
So, what we’re trying to do is really wrap the monolith and all the configuration and system dependencies into containers so the developers can just spawn a container, never worry about building and configuring it, and just talk to it.
Lastly, refactor what matters; if you have gone through the experience of aligning your architectural needs with your business objectives, you should have a good idea where the need for invention, innovation, and experimentation is. So, start pulling out those parts of code or reimplementing those.
12:13 Service Modeling is Difficult
We’ve gone through battling the organization’s structure, getting money for our architecture, and finding ways of working around this monolith, and now we get to model our new services. But a pattern that we see over and over again is getting too fine‑grained too early.
The question that I often get asked is: So how big is this microservice? The idea of a microservice is that it’s an independently releasable service around your business domain concept. Some people say, “Oh, six people, six services.” That might be too small. The one I like is that you can kind of replace it in two weeks, depending I guess on your programming language and infrastructure.
But, the point is, it has to be as small as you can handle. What I mean by that is if your organizational maturity in terms of automation and streamlining the development process is just not there yet to handle more than two services, then two services are probably too many.
So, make sure that you encapsulate business logic in the service and you don’t leak that to your consumers. Build the automation, and handle distributed systems in such a way that you can have a healthy size, and the size can change over time.
No utility services please. I see these services that remind me of my messy kitchen drawer where things, concepts that [don’t] quite fit in other domains, get pushed into it – a config service, utility service, reference data. If you find yourself building one of those, stop, think again, because it becomes one of those bottlenecks very soon as a shared service.
I’ve seen people put country codes in a service as reference data. How often do we build or create new countries? Not that often. So maybe duplicating that information across services is perfectly fine. Or if you have any other capabilities that you want to put in that service, think about the domain contexts that it belongs to more closely.
And with the best intentions, it might happen that we choose the wrong boundary for our services. Have you seen those stressed parents on a flight? Where for some reason the two toddlers are in the front of the plane and the parents are sitting in the back of the plane? It happens, right?
So, how do we find these right boundaries? If we sit around the table, and think academically, “What are these beautiful domain concepts?”, we probably won’t end up with the right boundaries. My suggestion is: use your user experience and user journeys. What are the real use cases of your system? Walk through them and build domain services that support those.
Test the boundaries by asking, “OK, if I’m building this kind of related feature for my end user, how many services do I need to touch?”
For instance, if I’m a mobile provider and I want to support international roaming, there are a whole bunch of capabilities around that. How many services do I need to change? If you’re changing more than three services, you probably got the boundaries wrong.
With boundaries like that, you can’t really realize the idea of independence. But even if you got the boundaries wrong, that’s fine, because Level 3 of the REST maturity model will come to save you. If you use hyperlinks between the aggregate roots and their sub‑nodes, you can move things around; you have more flexibility in decoupling the services or joining them without impacting your consumers, because your consumers are just following the hyperlinks.
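A small sketch of why hyperlinks give you that flexibility. The `_links` layout, the URLs, and the order/shipment resources are assumptions made up for this example (the shape loosely follows HAL-style links), and the two dictionaries stand in for real HTTP responses.

```python
def follow(resource, rel, fetch):
    """Look up the link relation `rel` on a resource and fetch its target."""
    return fetch(resource["_links"][rel]["href"])


def shipments_for_order(order_url, fetch):
    """Consumer code: it knows the entry URL and the relation name, nothing else."""
    order = fetch(order_url)
    return follow(order, "shipments", fetch)["items"]


ORDER_URL = "http://orders.example/orders/42"

# Before: shipments are a sub-resource of the (hypothetical) orders service.
BEFORE = {
    ORDER_URL: {
        "id": 42,
        "_links": {"shipments": {"href": "http://orders.example/orders/42/shipments"}},
    },
    "http://orders.example/orders/42/shipments": {"items": ["parcel-1"]},
}

# After: shipments moved to their own service; only the href in the order changed.
AFTER = {
    ORDER_URL: {
        "id": 42,
        "_links": {"shipments": {"href": "http://shipping.example/shipments?order=42"}},
    },
    "http://shipping.example/shipments?order=42": {"items": ["parcel-1"]},
}

# The same consumer function works against both layouts.
before_result = shipments_for_order(ORDER_URL, BEFORE.__getitem__)
after_result = shipments_for_order(ORDER_URL, AFTER.__getitem__)
```

The consumer never builds the shipments URL itself, so splitting or merging services only changes what the provider puts in `href`.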
17:24 Constructing Microservices
I can’t emphasize this enough: we should not build any services without having an automated path to release. We started building microservices for an enterprise pizza company in Australia a couple of years back and we were so excited, we went from no services to sixty services in three months.
We had all these independent build pipelines, we had some principles locked down early on, which was really good. We said, “No single line of code, service code, without a contract test.”
So the first things that we built, if we thought of a new service, [the first thing we asked] was how the consumers were gonna use it, and write a contract test for it. Then we would write a mock‑up, or a prototype, to show how the request‑responses would flow through the service. Then we would write the service code, and as we do that, we’d build the delivery pipeline and automation for that.
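As an illustration of "contract test first", here is a minimal hand-rolled sketch. It is not the tooling used on that project; the contract layout, the `/orders/42` endpoint, and the stub provider are hypothetical. The consumer states which request it will make and which fields it relies on, and the same check can run against the mock-up first and the real service later.

```python
# The consumer's expectations: one request and the response fields it depends on.
CONTRACT = {
    "request": {"method": "GET", "path": "/orders/42"},
    "response": {"status": 200, "body": {"id": int, "status": str}},
}


def stub_provider(method, path):
    """Mock-up of the provider, written before any real service code exists."""
    if (method, path) == ("GET", "/orders/42"):
        return 200, {"id": 42, "status": "baking", "eta_minutes": 12}
    return 404, {}


def contract_violations(contract, provider):
    """Call the provider as the consumer would and list every broken expectation."""
    request, expected = contract["request"], contract["response"]
    status, body = provider(request["method"], request["path"])
    problems = []
    if status != expected["status"]:
        problems.append(f"status {status}, expected {expected['status']}")
    for name, expected_type in expected["body"].items():
        if name not in body:
            problems.append(f"missing field: {name}")
        elif not isinstance(body[name], expected_type):
            problems.append(f"wrong type for field: {name}")
    return problems


violations = contract_violations(CONTRACT, stub_provider)
```

Extra fields such as `eta_minutes` are deliberately ignored: the contract only pins down what this consumer actually uses, which leaves the provider free to evolve everything else.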
We made a few mistakes; if I could go back, one of them was that we compromised on a few things. We compromised on debuggability: you know, using correlation IDs so we can debug our services live. I’d probably build that in early on. We did have good monitoring and health checks, but we didn’t have good debugging built in place.
We invested early on in scaffolding; some people call it a chassis or service templates. Basically, all the boilerplate code that you need to build to put your service logic in the shell: monitoring, exposing health metrics, structured logging, authentication, and security.
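For readers who have not used correlation IDs, a minimal sketch of the idea: the first service to see a request assigns an ID, every downstream call carries it, and every log entry records it. The `X-Correlation-ID` header name is a common convention rather than a standard, and the two services and the in-memory log are hypothetical.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # conventional name, assumed here


def ensure_correlation_id(headers):
    """Return a copy of the headers that is guaranteed to carry a correlation ID."""
    out = dict(headers)
    out.setdefault(CORRELATION_HEADER, str(uuid.uuid4()))
    return out


def handle_payment(headers, log):
    """Downstream service: logs with the ID it was handed."""
    log.append({
        "service": "payments",
        "correlation_id": headers[CORRELATION_HEADER],
        "event": "charged",
    })


def handle_order(headers, log, call_downstream=handle_payment):
    """Edge service: assigns an ID if the caller sent none, then propagates it."""
    headers = ensure_correlation_id(headers)
    correlation_id = headers[CORRELATION_HEADER]
    log.append({
        "service": "orders",
        "correlation_id": correlation_id,
        "event": "received",
    })
    call_downstream({CORRELATION_HEADER: correlation_id}, log)
    return correlation_id


log = []
request_id = handle_order({}, log)
# Every entry for this request can now be found by filtering on request_id.
trace = [entry for entry in log if entry["correlation_id"] == request_id]
```

With structured logs shipped to one place, filtering on that single ID reconstructs the path of one request across services.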
21:06 Find your Granularity
So, we talked about the different values of microservices, but we need to acknowledge the complexities that come with it. We love the idea of autonomous teams, but teams are made of people, people form groups, and silos happen. So communication overhead comes with that autonomy.
We love microservices because they let us go fast. We can take a single small service to production very quickly; but to be able to do that, there is an execution overhead and complexity: all the automation that we have to put in place to be able to deliver it seamlessly.
We like microservices because it gives us scale; now you can break down the domain problem into small subdomains, allocate teams to each subdomain, and give them autonomy.
Building distributed systems is hard. Building resilience, consistency, and all of those things bring some complexity to your infrastructure. We expose wonderful fine‑grained APIs that let our consumers compose different solutions. The composability gives us new ways of building systems, but now we have to maintain a range of APIs and backward compatibility, and that maintenance has its own overhead.
We can use different technology stacks for every service as long as it talks and walks like a service from outside. I can use whatever tech stack I want for the service, but now our operation team, or more generally the organization, needs to deal with a proliferation of technology stacks.
So, what I would suggest is get one of these giant granularity sliders and really find the spot where the number of services and granularity of the services for your organization yields a net positive value. So, if you’re going too micro, you need to bring the automation and the infrastructure to support that. So find that right spot for where you are and where your organization is.
22:07 Leap in our Empathetic Evolution
Lastly, I would leave you with this: we have talked about the softer side of architecture: the complexity of communication between people, people who build the services and people who consume the services. That communication overhead causes friction between the teams.
So I argue that for us to be able to successfully deliver microservice architecture, we need to make a leap in our empathetic evolution. We need to adopt empathic development practices that really let us understand the needs of the consumer.
So there are a whole bunch of techniques that are already out there. Build APIs that your consumers actually need, and get the consumers to give you contract tests to validate your assumptions about how they are using your service. As service providers, provide mock‑ups to the consumers so that they can run their applications independently of integration tests.