Avoiding Pitfalls When Doing Microservices
After working on a microservices-based solution for more than a year, both as a developer and as the lead for a few services, I have more bad experiences to share than good ones. I understand that microservices, when done well, deliver great returns (Netflix, Google, Amazon, etc.), but there are many pitfalls on this journey.
When this journey drags on long enough, executive management runs out of patience, and we start "doing microservices" rather than "being a microservices solution team".
I hope the following makes sense and helps you in your journey.
Don't Have Only One Supervisor
Don't have only one person in the role of senior architect supervising the entire microservices-based solution. Even if this is only a POC, it will eventually go into production. Once it is there, don't bet on your executives giving you time to refactor and redesign. Business pressure is very real and drives many decisions, including technical ones.
In practice, this single person (master architect/main architect) will have too much on his/her plate and won't have in-depth knowledge of every tech stack or business domain. Lack of interaction with developers will gradually allow each scrum team to build its own castle. Don't count on management to notice: they are happy as long as the ETA is met, something is marked 'yes' or colored green in some spreadsheet, and a bunch of demos go well.
If you don't want to end up as another prime example of Conway's law, then have a team of senior engineers from different domains who set the technical guidelines and ensure that each team follows them. Please don't end up in a situation where one team is using DES and another is using AES.
Once you start growing, silos are bound to form among scrum teams or developers. Microservices anti-patterns feed on this culture and reduce your ROI.
No Upfront Plan for CI/CD for Current and Future Services
Skipping or delegating this step will result in many problems, not just technical ones but also between people. I found this part to be the hardest of all because of the cultural change involved: dealing with multiple teams, each with its own priorities, and a lack of common understanding. Just using Git, Jenkins, and Docker is NOT CI/CD.
If a severity-1 ticket opened by a customer in the production environment is the moment you realize that your services can't scale and have many hardcoded elements, then you have done it all wrong. You are on the back foot, spending the next three months in a so-called hardening release doing nothing but fixing.
If you rely on dev-team-A to test and send a broadcast email saying that the service-1 API has changed, has broken the integration, and can't be consumed, then you are doomed. In the name of microservices, you can't just go ahead and change an API on the assumption that an email will have informed everyone and that they are ready. At the very least, maintain some form of backward compatibility and update the automation accordingly.
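As one illustration of catching such a change in the pipeline rather than by email, here is a minimal sketch of a backward-compatibility check that a CI job could run before a new API version ships. The schema format (a plain dict of field name to type name), the function name `find_breaking_changes`, and the sample schemas are all assumptions made up for this example, not part of any specific tool.

```python
def find_breaking_changes(old_schema, new_schema):
    """Return the changes in new_schema that would break existing consumers.

    Schemas are plain dicts mapping field name -> type name. This format is a
    hypothetical simplification; real projects often use OpenAPI or similar.
    """
    problems = []
    for field, old_type in old_schema.items():
        if field not in new_schema:
            # Consumers that read this field will fail.
            problems.append(f"removed field: {field}")
        elif new_schema[field] != old_type:
            # Consumers that parse this field will fail.
            problems.append(
                f"type changed: {field} ({old_type} -> {new_schema[field]})"
            )
    # Fields that exist only in new_schema are additive and treated as safe.
    return problems


# Hypothetical response schemas for service-1, before and after a change.
V1 = {"id": "string", "amount": "number", "currency": "string"}
V2_ADDITIVE = {"id": "string", "amount": "number", "currency": "string", "note": "string"}
V2_BREAKING = {"id": "string", "amount": "string"}

if __name__ == "__main__":
    print(find_breaking_changes(V1, V2_ADDITIVE))  # additive change only
    print(find_breaking_changes(V1, V2_BREAKING))  # removal and type change
```

A build step could fail whenever the returned list is non-empty, which forces the breaking change to be versioned or negotiated with consumers before it is merged.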
In short, if your automated tests in "CI" and your monitoring in "CD" don't detect such defects, then stop calling it CI/CD. Just call it a "social exercise for finding bugs because we have many people and enough time".
Coding to Meet ETA and Not to Monitor, Trace, and Debug
Avoid situations where a developer from Team-A/service-1 has to ask a developer from Team-B/service-2 how to verify the data ingested by service-1.
Avoid situations where you don't know where the logs are in prod, or where you can't even tell whether a defect reported by a customer is due to infrastructure, network, application, or orchestrator. The log message format, file rotation, and persistence must not be left to individual scrum teams but inherited from common global settings.
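One way to make such settings "inherited" is a small shared module that every service imports instead of configuring logging on its own. The sketch below uses only Python's standard `logging` module. The JSON field names, the function name `configure_logging`, and the rotation defaults are assumptions chosen for illustration.

```python
import json
import logging
import logging.handlers
import time


class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object per line with a fixed set of keys."""

    def __init__(self, service_name):
        super().__init__()
        self.service_name = service_name

    def format(self, record):
        return json.dumps({
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(record.created)),
            "level": record.levelname,
            "service": self.service_name,
            "logger": record.name,
            "message": record.getMessage(),
        })


def configure_logging(service_name, log_path, max_bytes=10_000_000, backups=5):
    """Return a logger with the shared format and rotation policy.

    max_bytes and backups are illustrative defaults, not recommendations.
    """
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(JsonFormatter(service_name))
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers = [handler]   # replace any team-specific handlers
    logger.propagate = False      # avoid duplicate lines via the root logger
    return logger
```

A service would then call something like `log = configure_logging("service-1", "/var/log/service-1.log")` once at startup (the path here is hypothetical), so format and rotation changes happen in one place for every team.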
You must have a solid design in place so that every service has observability, traceability, and a centralized logging facility. Make your ops team's life easier, because they are the ones facing the heat to ensure 99.99% availability. How difficult can it be to ensure that every service's log file uses a common log message format?
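For traceability, a common approach is to carry one correlation ID through every service a request touches and stamp it on every log line. The following is a minimal sketch using the standard library only. The header name `X-Correlation-ID` and the helper names `begin_request` and `outgoing_headers` are assumptions for this example; real deployments often rely on a tracing library instead.

```python
import contextvars
import logging
import uuid

# Hypothetical header name; any name works as long as every service agrees on it.
TRACE_HEADER = "X-Correlation-ID"

# Holds the ID of the request currently being handled ("-" when there is none).
_correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""

    def filter(self, record):
        record.correlation_id = _correlation_id.get()
        return True


def begin_request(incoming_headers):
    """Reuse the caller's ID if present, otherwise start a new trace.

    incoming_headers is a plain dict here; the lookup is case-sensitive,
    which is a simplification compared with real HTTP header handling.
    """
    cid = incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex
    _correlation_id.set(cid)
    return cid


def outgoing_headers():
    """Headers to attach to calls made to the next service."""
    return {TRACE_HEADER: _correlation_id.get()}
```

With the filter added to a handler and `%(correlation_id)s` included in the format string, searching the centralized logs for a single ID shows the path of one request across services.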
Silos are inevitable, and they leave most people without an end-to-end picture of the solution. Once you sort out the communication part, debugging becomes easier, instead of each scrum team routing "the defect" to the next hop (some other scrum team).
Remember, there is a reason why not every team using Spring Cloud is Netflix.