[This article was written by John Wetherill.]
This is a continuation of the Microservices and PaaS - Part I blog post I wrote last week, which was an attempt to distil the wealth of information presented at the microservices meetup hosted by Cisco, with Adrian Cockcroft and others presenting.
Part I provided a brief background on microservices, with a summary of some lessons learned by microservices pioneers.
In this installment I will cover a number of practices related to microservices that were discussed during the meetup.
A followup article will dive into the advantages that Platform as a Service brings to microservice development.
I'm calling these "Microservice Practices," not "Microservices Best Practices" because microservices-based architectures are still evolving, with new practices, techniques, tools, and patterns emerging constantly.
At the meetup a number of practices were highlighted that Netflix and other microservices pioneers have spearheaded in their efforts to adopt a microservices mentality across their organizations.
Break Things Deliberately
According to Netflix: "We have found that the best defense against major unexpected failures is to fail often." Netflix has brought us "Chaos Monkey," a tool whose sole purpose is to break things, often and randomly. They run it continuously against their production systems to bring down essential services and confirm that doing so doesn't disrupt the user experience or their overall service. It's much better to deliberately break the system in the middle of the morning, when all teams are assembled and sufficient caffeine has been consumed, than to be informed of a breakage by a page at 3am.
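To make the idea concrete, here is a minimal sketch of a single "chaos round": kill one instance at random, then check that requests are still served. It is not how Chaos Monkey itself is implemented; `ServicePool`, `chaos_round`, and the instance ids are hypothetical names, and an in-memory set stands in for real infrastructure.

```python
import random


class ServicePool:
    """Hypothetical in-memory stand-in for a group of redundant service instances."""

    def __init__(self, instance_ids):
        self.healthy = set(instance_ids)

    def kill_random_instance(self, rng):
        """Terminate one healthy instance chosen at random and return its id."""
        victim = rng.choice(sorted(self.healthy))
        self.healthy.remove(victim)
        return victim

    def handle_request(self):
        """Serve a request from any surviving instance; fail only if none remain."""
        if not self.healthy:
            raise RuntimeError("no healthy instances left")
        return "ok"


def chaos_round(pool, rng):
    """Break something on purpose, then check that users would not notice."""
    victim = pool.kill_random_instance(rng)
    return victim, pool.handle_request() == "ok"


pool = ServicePool(["a1", "b2", "c3"])
victim, survived = chaos_round(pool, random.Random(0))
```

The point of the sketch is the second half of `chaos_round`: the breakage only counts as a success if the service keeps answering afterwards.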
No Manual "Anything"
In a world where microservices come and go, grow and shrink, and migrate around racks and data centers in seconds - there's absolutely no room for manual intervention. All aspects of deployment, monitoring, testing, and recovery must be fully automated. For example, monitoring a service should occur instantly and automatically by virtue of it being deployed, not requiring a separate manual step.
Similarly, failure discovery and rerouting to old code, as described in Part I of this blog, must be fully automated, with no human intervention required.
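As a rough illustration of automated rerouting, the sketch below sends traffic to newly deployed code and falls back to the old code as soon as the new code fails, with nobody in the loop. `VersionedRouter`, the handler functions, and the `max_failures` threshold are all assumptions made for the example, not a description of any particular router.

```python
class VersionedRouter:
    """Hypothetical router: prefers new code, falls back to old code on failure."""

    def __init__(self, old_handler, new_handler, max_failures=3):
        self.old_handler = old_handler
        self.new_handler = new_handler
        self.max_failures = max_failures
        self.failures = 0

    @property
    def using_new(self):
        """New code stays in rotation until it has failed max_failures times."""
        return self.failures < self.max_failures

    def handle(self, request):
        if self.using_new:
            try:
                return self.new_handler(request)
            except Exception:
                # Count the failure and serve this request from the old code.
                self.failures += 1
        return self.old_handler(request)


def old_cart(request):
    return f"old:{request}"


def broken_new_cart(request):
    raise RuntimeError("bad deploy")


router = VersionedRouter(old_cart, broken_new_cart)
responses = [router.handle(i) for i in range(5)]
```

Every request in this run still gets an answer, and after the third failure the new code is no longer tried at all.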
Respect Human Attention Span
Speaking of humans, a typical human's attention span, say when filling out a shopping cart, is around 10 seconds. If a failure occurs when deploying an updated shopping cart microservice, it's important that the total time from the failure, through reporting, to rerouting to existing, working code stays within roughly 10 seconds. Obviously this shouldn't happen too often, but the occasional 10-second gap in response will probably not lose the customer. A five-minute or five-hour lag, resulting from manual intervention and rollback, will.
Denormalize like Crazy
Refactor database schemas, and de-normalize everything, to allow complete separation and partitioning of data. That is, no underlying table should serve more than one microservice, and data should not be shared at the storage level. Instead, if several services need access to the same data, it should be shared via a service API (such as a published REST or message service interface).
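Here is a small sketch of what "share via an API, not via tables" can look like. The service names and methods are hypothetical, plain dictionaries and lists stand in for each service's own database, and an in-process method call stands in for the REST or messaging interface.

```python
class CustomerService:
    """Owns customer data; no other service touches its storage directly."""

    def __init__(self):
        # Private storage, standing in for this service's own database.
        self._customers = {}

    def add_customer(self, customer_id, email):
        self._customers[customer_id] = {"email": email}

    def get_email(self, customer_id):
        """Published API: the only supported way to read customer data."""
        return self._customers[customer_id]["email"]


class OrderService:
    """Keeps its own order storage and asks CustomerService for customer data."""

    def __init__(self, customer_api):
        self._customer_api = customer_api
        self._orders = []

    def place_order(self, customer_id, item):
        email = self._customer_api.get_email(customer_id)
        order = {"customer_id": customer_id, "item": item, "confirm_to": email}
        self._orders.append(order)
        return order


customers = CustomerService()
customers.add_customer("c1", "molly@example.com")
orders = OrderService(customers)
order = orders.place_order("c1", "pot pie")
```

Because `OrderService` only knows about `get_email`, the customer team can change how and where they store data without breaking orders.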
Each microservice can have its own persistence layer. Gone are the days of a single monolithic database instance that's shared across all parts of an application. Databases are getting cheaper and easier. As an example, Neo4j allows you to embed an industrial-strength, self-contained graph database in your microservice at the cost of a few megabytes in a JAR file, with startup time on the order of milliseconds. That's essentially free.
Even better, any PaaS worth its salt will provide multiple database services that can be spawned and accessed at the drop of a hat.
With technology like this at our disposal, it makes sense to use the persistence layer that fits, both to the problem being solved, and to the expertise - and passions - of the team that's solving the problem.
Avoid Trunk Conflicts
The old mindset had all code for a large project contained in a single source repository. This can be slightly easier to set up and manage, but it ties the microservices together and makes it much more difficult to evolve them independently.
Instead, each microservice should have its own SCM repository so it can truly be updated and enhanced independently of other services.
One Service, One Manifest
Each microservice must have its own manifest and dependencies, instead of maintaining a global dependency list for all services. This allows, for example, one microservice to depend on Spring v3.2, while another can require Spring 4.1. The dependencies for one microservice can change over time with no effect on the dependencies of other microservices.
All microservices should run in a container, such as Tomcat, Docker, or whatever container system is provided by the PaaS (you are running a PaaS, aren't you?). Do not run microservices on bare metal, or directly on a VM. Containerization brings countless advantages, particularly a consistent, isolated runtime environment that can easily migrate around the data center or around the globe. With Docker and other modern containerization approaches, there is very little overhead in running in a container, and considerable upside.
Do not build stateful services. Instead, maintain state in a dedicated persistence service, or elsewhere. This is a well-known practice brought to us by the cloud. When an application instance maintains state, it can't easily be moved, scaling is more complex, and it's more likely to cause problems when it fails. This practice applies even more to microservices, which in general should be lightweight and instantly replaceable on failure, and should be able to hop around data centers.
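The sketch below shows why keeping state out of the instance makes it replaceable. `SessionStore` and `CartInstance` are hypothetical names, and a dictionary stands in for the dedicated persistence service.

```python
class SessionStore:
    """Hypothetical dedicated persistence service (a dict stands in for it)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


class CartInstance:
    """Stateless service instance: holds a reference to the store, no cart data."""

    def __init__(self, store):
        self._store = store

    def add_item(self, session_id, item):
        # Read the current state, update it, and write it straight back.
        items = list(self._store.get(session_id, []))
        items.append(item)
        self._store.put(session_id, items)
        return items


store = SessionStore()
first = CartInstance(store)
first.add_item("s1", "book")

# The first instance dies; a replacement picks up where it left off.
replacement = CartInstance(store)
items = replacement.add_item("s1", "pen")
```

Nothing is lost when `first` goes away, because the cart never lived inside it.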
Don't Name your Chickens
People who raise chickens soon learn that naming chickens is a bad idea: after naming a chicken you get attached to it, at least the kids do, and it can be uncomfortable to have to explain at the dinner table that the chicken pot pie is really "Molly." Instead, number your chickens, so you can say "that was chicken #38" or even better, "that was chicken 586ec9bd." Makes for a much more enjoyable meal.
The same can be said of computer systems. Do not name systems after planets, or animals, or philosophers, or prisons, as was common practice in the UNIX world for decades. Instead, assign them GUIDs, and don't attach any sort of significance to them, like assigning them specific roles or purposes. Systems should be commodities, like McDonald's franchises. Each McDonald's is eerily similar, with the advantage that if one shuts down you can just walk an extra few blocks and be served the exact same burger at the same price in the same amount of time.
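Numbering the chickens takes about three lines. This sketch uses Python's standard `uuid` module; `provision_hosts` is a hypothetical helper name.

```python
import uuid


def provision_hosts(count):
    """Return `count` interchangeable host ids with no role baked into the name."""
    return [uuid.uuid4().hex for _ in range(count)]


hosts = provision_hosts(3)
```

Each id is a random 32-character hex string, so nothing about the name hints at what the host does or tempts anyone to get attached to it.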
Create and Curate Access Libraries
Microservices are accessed by externally published APIs or protocols. This allows the microservice implementation to completely change with no effect on its consumers, as long as the API remains constant.
But just publishing an API is not enough. The microservice provider should also be responsible for building and stewarding client libraries used to access the service.
If this is not done, the construction of these libraries will be left to third parties, and will likely result in fragmentation where various implementations might have slight differences, or implementors may incorrectly interpret the spec and introduce inconsistencies which then stick.
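As a sketch of what a provider-maintained client library might look like, the example below wraps one endpoint of an imaginary cart service. The class name, the URL path, and the payload fields are all assumptions for illustration, and the transport is injected as a plain callable so the sketch runs without a network.

```python
import json


class CartClient:
    """Hypothetical client library published by the cart service's own team."""

    def __init__(self, transport):
        # `transport` is any callable(method, path, body) returning a JSON string.
        self._transport = transport

    def add_item(self, cart_id, sku, quantity=1):
        """Validate input once, here, so every consumer gets the same behavior."""
        if quantity < 1:
            raise ValueError("quantity must be at least 1")
        body = json.dumps({"sku": sku, "quantity": quantity})
        raw = self._transport("POST", f"/carts/{cart_id}/items", body)
        return json.loads(raw)


def fake_transport(method, path, body):
    """Stand-in for the real service, echoing the request back."""
    return json.dumps({"method": method, "path": path, "item": json.loads(body)})


client = CartClient(fake_transport)
result = client.add_item("42", "sku-1", 2)
```

Consumers call `add_item` and never build paths or payloads themselves, which is where the slight differences between third-party implementations tend to creep in.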
Optimize the Interaction
One downside of a microservices architecture is the "fanout" problem, where a single request to the overall application results in 10 or 20 requests bubbling through the various microservices the application relies on. This dramatic increase in network traffic calls for more efficient communication between microservices. Instead of transmitting the text-based payloads typical of REST interfaces (JSON or XML, for example), consider using something like Google Protocol Buffers, Simple Binary Encoding, or Apache Thrift to decrease the size of the payload and optimize the inter-microservice communications.
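To get a feel for the size difference, the sketch below encodes the same small message as JSON and as a fixed binary layout. It uses Python's standard `struct` module purely to illustrate the idea; it is not Protocol Buffers, Simple Binary Encoding, or Thrift, and the message fields are made up for the example.

```python
import json
import struct

reading = {"user_id": 123456, "item_id": 789, "quantity": 2}

# Text encoding: field names and digits travel with every message.
as_json = json.dumps(reading).encode("utf-8")

# Binary encoding: both sides agree on the layout up front
# (unsigned 32-bit user_id, unsigned 32-bit item_id,
# unsigned 16-bit quantity, network byte order).
LAYOUT = "!IIH"
as_binary = struct.pack(
    LAYOUT, reading["user_id"], reading["item_id"], reading["quantity"]
)

# The receiver recovers the same values from the agreed layout.
user_id, item_id, quantity = struct.unpack(LAYOUT, as_binary)
```

The binary form is a fraction of the size of the JSON form here, at the cost of both sides having to agree on the layout in advance. The dedicated formats named above exist largely to manage that agreement as schemas evolve.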
Release the Monkeys
Netflix has released what they call the "Simian Army," a suite of tools including Chaos Monkey, mentioned above, whose purpose is to help an organization build resilient, scalable, fault-tolerant software. The suite includes such tools as Janitor Monkey, which reclaims unused resources, Security Monkey, which looks for security vulnerabilities, Latency Monkey, which induces artificial delays in the REST layer to flush out latency issues, and many more.
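The latency-injection idea is simple enough to sketch in a few lines. This is not Latency Monkey's implementation; `with_injected_latency` is a hypothetical wrapper, and the `sleep` function is injectable so the example can record the delay instead of actually waiting.

```python
import random
import time


def with_injected_latency(handler, max_delay_seconds, rng, sleep=time.sleep):
    """Wrap `handler` so each call is preceded by a random artificial delay."""

    def wrapped(request):
        delay = rng.uniform(0, max_delay_seconds)
        sleep(delay)
        return handler(request)

    return wrapped


# Record the delays instead of sleeping, so the sketch runs instantly.
recorded = []
slow_lookup = with_injected_latency(
    lambda r: r.upper(), 0.5, random.Random(1), sleep=recorded.append
)
result = slow_lookup("ping")
```

Wrapping a dependency like this during testing shows quickly which callers have no timeout and which ones fall over when a downstream service gets slow.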
As Phil described last week in his blog Devops: Tools vs. Culture, most organizations don't have the resources or luxury of being able to build their own toolsets when evolving to a microservices and devops culture. Instead they must leverage existing tools, and fortunately lots of tools are constantly appearing. It's worth spending the effort searching and researching these tools, and incorporating them into your overall development process when they make sense.
To be continued... Again
I originally intended to cover last week's microservices meetup in a single blog post, which then expanded to two. I have yet to address the power of PaaS in microservices architectures, and I'm out of space already.
So I will continue this Microservices and PaaS theme next week, finally getting into PaaS, and discuss how Platform as a Service can significantly streamline the microservices development process.