The Biggest Misconception About Microservices
The microservice style is often misunderstood because teams lack knowledge of other architectural styles and their trade-offs.
As we move more and more towards microservice architectures, design teams often seem to forget that other architectural styles are available. The monolithic style (not to be confused with the ‘big ball of mud’ architecture), service-oriented architecture, layered architecture, blackboard architecture, and others are ‘forgotten’. Because teams lack knowledge of these other styles and their trade-offs, the microservice style is misunderstood.
The main problems (among others) I encounter while working with microservices are:
- Services that are too granular, causing an unmaintainable and inoperable landscape.
- A so-called distributed monolith where multiple services need to be changed when functionality needs to be changed.
- Unnecessary data replication and duplication causing data leakage, stale data, etc.
In this article, I will focus on the third problem, as it seems to be the least understood. It is often driven by the idea that the software being developed must be autonomous at run-time. In some cases, this is indeed required from a resilience perspective: a service needs data to be available more reliably than the system maintaining that data can guarantee. The shortfall could be in the uptime of that system, its performance, etc. In other words, the system can’t adhere to the quality requirements of the consuming system. In these cases, from a “quality perspective,” duplicating the data within a service could be a good solution. But there is a trade-off to this solution.
The data in the consuming service needs to be maintained. And while this seems rather easy with event publishing systems like Kafka, there are some catches. For starters, the data in the consuming system needs to be up to date, so it has to be pulled out of the system of record that holds it. Given the first argument (“the system can’t adhere to the quality requirements of the consuming system”), it is fairly safe to assume that this system is a bit outdated and probably not able to publish events natively. This is where Change Data Capture (CDC) solutions often come into play, adding yet another layer of complexity to the data-duplication trade-off. As said, there are situations in which these trade-offs are a necessity.
What I encounter in foundational microservice architectures (by which I mean that microservices are the main architectural style and there is not much legacy to comply with) is that, in these cases too, the microservices rely on their own copy of the data. But if we have to build horizontally scalable services (one of the principles of this architectural style), why do we have to copy the data? With messaging (be it REST, gRPC, COM+, or CORBA; aaaw, who remembers CORBA?) we don’t need to copy data to a service. We just pick up the data from the source service via a well-defined contract (and use consumer-driven contract testing to avoid regression problems).
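As a minimal sketch of reading data through a contract instead of keeping a local copy, consider the following. All names here (`CustomerContract`, `CustomerClient`, `parse_customer`, the field names, and the `fake_source_service` stub) are hypothetical and not taken from any framework. The transport is injected as a plain callable, so the same idea could sit on top of REST, gRPC, or anything else.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class CustomerContract:
    """Fields this consumer relies on (hypothetical, agreed with the producer)."""
    customer_id: str
    email: str


class ContractViolation(Exception):
    """Raised when the producer's response no longer matches the contract."""


def parse_customer(payload: Mapping[str, object]) -> CustomerContract:
    # Only the fields the consumer needs are checked; extra fields are ignored,
    # so the producer can add data without breaking this consumer.
    try:
        customer_id = payload["customer_id"]
        email = payload["email"]
    except KeyError as missing:
        raise ContractViolation(f"missing field: {missing}") from missing
    if not isinstance(customer_id, str) or not isinstance(email, str):
        raise ContractViolation("customer_id and email must be strings")
    return CustomerContract(customer_id=customer_id, email=email)


class CustomerClient:
    """Reads customers from the source service on demand; keeps no local copy."""

    def __init__(self, fetch: Callable[[str], Mapping[str, object]]):
        # `fetch` is the transport (an HTTP call, a gRPC stub, ...), injected
        # so that a test can replace it with a stub.
        self._fetch = fetch

    def get(self, customer_id: str) -> CustomerContract:
        return parse_customer(self._fetch(customer_id))


def fake_source_service(customer_id: str) -> Mapping[str, object]:
    # Stand-in for the producing service, used only for this illustration.
    return {"customer_id": customer_id, "email": "a@example.com", "tier": "gold"}


client = CustomerClient(fake_source_service)
customer = client.get("42")
```

The same `parse_customer` check can double as a simple consumer-side contract test: run it against a recorded or stubbed producer response and fail the build when a field the consumer depends on disappears.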
Nine times out of ten, the argument will be: “What if the producing service is down? Then this will hurt our solution as well.” In most cases the services run on the same cloud infrastructure, so what are the odds that the producing service is down and yours is not? Sure, errors may be made during development and operations, but these are easily mitigated with canary releases, etc. The question is: what is the bigger trade-off? Duplicating the data and having to manage it, or making sure that services have a high enough availability?
The second argument often heard is: “There are so many services we rely on… this really hurts our uptime. If a system is available 99% of the time, then five systems working together have an availability of 0.99⁵ = ….” This is also true, but if this is the case, have you really chosen the right boundaries for your services? What reason is there to split the system into five parts that work closely together (being coupled)? And which is the better trade-off: becoming less available, or having to maintain data living throughout the many services?
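The arithmetic behind that argument can be sketched as follows. This assumes the services fail independently and that a request needs every one of them, which is a simplification; the function name `serial_availability` is made up for illustration.

```python
import math


def serial_availability(availabilities):
    """Availability of a call chain in which every service must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return math.prod(availabilities)


chain = [0.99] * 5  # five services, each available 99% of the time
combined = serial_availability(chain)
print(f"{combined:.4f}")  # prints 0.9510
```

Under these assumptions, five 99% services in a chain are together available roughly 95% of the time, which is why tightly coupled service boundaries weigh so heavily on uptime.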
Have you considered making sure there is enough bandwidth to provide the data from the source service, or making it horizontally scalable in order to upgrade its availability? Only if none of these solutions (decoupling, bringing cohesive functionality under one roof, upgrading availability or bandwidth) is better or cheaper than duplicating the data should you switch to duplicating the data. N.B.: I could have added the organizational perspective (the teams maintaining the software) to the equation as well, but I imagine you get the point.
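To get a feeling for what horizontal scaling can do for availability, here is a rough sketch. It assumes that instances fail independently and that one healthy instance is enough to serve requests; in practice instances on the same infrastructure share failure modes, so the result is an optimistic upper bound. The name `replicated_availability` is hypothetical.

```python
def replicated_availability(single: float, replicas: int) -> float:
    """Availability when at least one of `replicas` identical instances is up.

    Assumes independent failures: the service is down only if all replicas
    are down at the same time.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas


for n in (1, 2, 3):
    print(n, f"{replicated_availability(0.99, n):.6f}")
```

Even with the optimistic independence assumption relaxed, this kind of estimate helps to compare the cost of a few extra instances against the cost of maintaining duplicated data.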
Well, come to think of it, isn’t organizational scalability the essence of moving towards a microservice architecture? If it is not, why not work in a well set-up monolithic architecture, using a feature-based packaging structure that keeps the software easy to maintain? When performance issues occur and parts of your monolith require horizontal scaling, that is the moment to take the package out into its own deployable unit and rewrite the service and repository layers to send commands to and gather data from the new service, perhaps using a pipes-and-filters architecture to perform the transformations required.
So, before duplicating data in order to become ‘autonomous’ at run-time, consider the trade-off: it is not one to be taken lightly!
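One way to picture that extraction step is a repository interface with two interchangeable implementations. This is only a sketch under assumptions: the invoicing domain, the class names, and the `call_service` callable (standing in for whatever remote call the new service would expose) are all invented for illustration.

```python
from typing import Callable, Dict, Protocol


class InvoiceRepository(Protocol):
    """What the rest of the application depends on, before and after extraction."""

    def total_for(self, customer_id: str) -> int: ...


class InProcessInvoiceRepository:
    """Monolith variant: the invoicing package is called directly in-process."""

    def __init__(self, totals: Dict[str, int]):
        self._totals = totals

    def total_for(self, customer_id: str) -> int:
        return self._totals.get(customer_id, 0)


class RemoteInvoiceRepository:
    """Extracted variant: same interface, data gathered from the new service."""

    def __init__(self, call_service: Callable[[str], int]):
        # `call_service` is a placeholder for the real remote call.
        self._call_service = call_service

    def total_for(self, customer_id: str) -> int:
        return self._call_service(customer_id)


def outstanding_report(repo: InvoiceRepository, customer_id: str) -> str:
    # Calling code is unaware of whether the data is local or remote.
    return f"customer {customer_id} owes {repo.total_for(customer_id)} cents"


local = InProcessInvoiceRepository({"42": 1500})
remote = RemoteInvoiceRepository(lambda customer_id: 1500)
report_local = outstanding_report(local, "42")
report_remote = outstanding_report(remote, "42")
```

Because the feature package already hides its data behind one interface, moving it into its own deployable unit changes the repository implementation but not the callers, and no copy of the data is needed on either side.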
Published at DZone with permission of Sven Vintges. See the original article here.