A few months ago on a whim I attended a Cisco-hosted meetup that unbeknownst to me was to have a profound influence on my life. The meetup title was simply "Microservices" and the first speaker up was Adrian Cockcroft.
Adrian is a luminary in the world of enterprise software, and during his stint as Chief Architect and Director of Web Engineering at Netflix he directed the top-to-bottom re-engineering of the entire netflix.com streaming video system, kicking off a cross-industry trend to do the same.
I'm honored to have had the opportunity to speak with him for over an hour last month on the very topic this meetup addressed: microservices. A much-condensed transcription of the discussion is here.
John: Hi Adrian, I appreciate you making the time to talk to me about microservices. I'll dive in and ask a few questions I've had after writing the blog series inspired by your talk.
Is there a benefit to adopting a polyglot ecosystem when building complex microservices-based systems, or is it better to stick to a single language stack?
Adrian: Actually Netflix is not all Java-based, and they did build out a Python-based version of the platform. It’s on the list of things to be open sourced and has never made it out. Parts of the Netflix front end are written in Node.js.
To your point, to support polyglot Netflix did a couple of things. One is that they open sourced a generic sidecar process called Prana written in Java which interfaces to the platform. They've built custom sidecars for memcached (a C application), Elasticsearch and Cassandra. At various times they’ve experimented with Redis, MongoDB and RabbitMQ. All of these things don't know they're running in this environment.
You put a sidecar on the instance, and the sidecar does the platform chatting and gathers data from whatever you're monitoring and sticks it in the standard monitoring tool. That's a fairly common approach which allows native polyglot support so you can deliver code in any language. Examples include pieces of the core personalization engine that are written in C++.
I think the other reason for polyglot is that people do have mixes of stuff. Core services are typically built around one language. For example Gilt has Ruby and Scala mixed together in different parts of the system because they were originally Ruby then they moved to Scala. I was talking to a group last week who started with Ruby, but now are mostly Node, but they do have a bit of Java in there too. So it isn't a single monoculture, but it is more difficult than people expect to maintain multiple languages.
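As a rough illustration of the sidecar approach Adrian describes, here is a minimal Python sketch. It is not Prana or any Netflix code: the `Sidecar` class, the `fake_memcached_stats` reader and the list standing in for the monitoring tool are all hypothetical names invented for this example. The point is only that the monitored application stays unaware of the platform while the sidecar does the translating.

```python
from typing import Callable, Dict, List


class Sidecar:
    """Hypothetical sidecar: reads stats from a local app and forwards them.

    The monitored application knows nothing about the platform. Only the
    sidecar talks to the (assumed) monitoring sink.
    """

    def __init__(
        self,
        service_name: str,
        read_stats: Callable[[], Dict[str, str]],
        publish: Callable[[Dict[str, float]], None],
    ) -> None:
        self.service_name = service_name
        self.read_stats = read_stats  # how to ask the local app for raw stats
        self.publish = publish        # how to hand metrics to the platform

    def poll_once(self) -> Dict[str, float]:
        """Collect raw stats, namespace and normalize them, then publish."""
        raw = self.read_stats()
        metrics = {
            f"{self.service_name}.{key}": float(value)
            for key, value in raw.items()
        }
        self.publish(metrics)
        return metrics


def fake_memcached_stats() -> Dict[str, str]:
    """Stand-in for querying a local memcached; the values are made up."""
    return {"curr_connections": "10", "get_hits": "250", "get_misses": "50"}


# A plain list stands in for the standard monitoring tool.
published: List[Dict[str, float]] = []
sidecar = Sidecar("memcached", fake_memcached_stats, published.append)
sidecar.poll_once()
```

In a real deployment the reader would talk to the application over its native protocol and the publisher would call the platform's monitoring client. Both are injected as plain functions here so the sketch stays language-agnostic about the thing being monitored.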
John: After I left the meetup it struck me that very few companies out there would have the resources or buy-in from management to “pull a Netflix” -- that is, reorganize the entire company and dedicate teams of very smart engineers to re-architecting around microservices. Most companies are still in the dark ages, with major corporations still running Windows XP servers under the POS systems in their retail outlets, batch-transferring the day's transactions to the backend over a modem. Sure, leading-edge and thought-leading companies like Netflix can make the leap, but how about the rest of the corporate world? How can they get on board the microservices train?
Adrian: There are two or three points to make here. One is, we started figuring this out in 2008 and we were reading the core white papers by Google and Amazon and taking into account the things we learnt. So we made stuff up, and yes, there certainly were a lot of very experienced, very clever people on board. A bunch of us were over 50 so we've been around. It's not a bunch of young kids making stuff up and thinking we're the first people ever to have done it.
So nowadays the patterns are well-known, the tooling is well-known, there are lots of PaaSes out there, the ideas aren't radical any more...they're being socialized and they're being tested and you can find people who get it. And there are large scale companies using microservices in production.
So years later it's just much easier to be a follower than trying to work everything out from first principles. It's also much less experimental. The bits that are difficult are just getting easier to do. There are still interesting problems, but it's much easier to do. That's one thing.
The other thing is I'm seeing a lot of big boring sounding companies who are well down the microservices path, much further than I thought. Watch the videos from DevOps Enterprise summit. It was a mind-blowing conference for me. I was in the audience, I didn't present there.
Nordstrom was on stage, saying “OK we figured out how to optimize for speed instead of cost.” The Department of Homeland Security talked about running Chaos Monkey on their production systems for processing green cards. We’re talking about your green card application, the immigration and naturalization part of the US government running Chaos Monkey on their production systems! I almost fell off my chair.
I was like "really?" Mark Schwartz, their CIO, was there and his presentation is so amusing, you just have to watch it. You do not expect a government CIO to be entertaining, but it was mind blowing.
Target, Macy's, Raytheon, a big bank in Canada, a major finance company, were basically talking about what it was like.
What happens is that somewhere in the company somebody says "we need to go faster, let's try this DevOps thing," or they get a hold of a copy of Gene Kim's Phoenix Project and sort of beat a few people around with it.
The Target guy said he bought 23 copies of Gene’s book and made everyone read it and started play acting scenes from it during an offsite. Everyone agreed this was probably going too far, but the principle is there. The idea gets in there and somebody convinces someone else who can just go off and try one project this new way.
And they measure and observe it, and they find they typically go from releasing once a quarter or once every six months to releasing a hundred times a quarter.
They're seeing two orders of magnitude increase in release frequency which means the size of the chunk of work being released is a hundredth as big as it was before. The flow works because these units of work are small, easy to understand and easy to roll back. The number of production issues and outages they have is an order of magnitude less than it was before.
So they're releasing a hundred times more often with a tenth of the total outages per quarter, which seems counterintuitive. I'm releasing a hundred times faster and it's breaking a tenth of the amount?
What does all this cost?
It turns out it costs about a half of what they used to spend because they're not spending as much effort building things that aren't needed, not buying machines that aren't needed. Turns out you actually can get more done with the same or fewer people, and you're delivering products to customers that they like better and you’re typically doing it faster than your previous teams could.
So you're getting stuff done quicker, you’re using fewer engineer days to do the work and you're doing it more effectively using fewer machine days to roll it out and make it work. The whole system becomes more efficient with less waste which is the whole principle of agile.
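To make the release arithmetic above concrete, here is a small Python sketch using the round figures Adrian quotes (a hundred times as many releases, a tenth of the outages). These are conversational estimates, not measured data, and the function name is made up for illustration. It also assumes, for simplicity, that every outage is attributable to a release.

```python
from fractions import Fraction


def outages_per_release_ratio(release_multiplier: Fraction,
                              outage_multiplier: Fraction) -> Fraction:
    """Return how the outage rate per release changes, relative to before.

    release_multiplier: new releases per quarter / old releases per quarter
    outage_multiplier:  new outages per quarter  / old outages per quarter
    """
    return outage_multiplier / release_multiplier


# Round numbers from the interview: 100x the releases, 1/10 the outages.
ratio = outages_per_release_ratio(Fraction(100), Fraction(1, 10))
print(f"Each release is about {ratio} as likely to cause an outage as before")
```

Under those assumptions each individual release carries roughly a thousandth of the outage risk it used to, which is why releasing far more often can still mean breaking things far less often.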
Stay tuned for Part II of my interview with Adrian, where he discusses a handful of other microservices topics including private vs. public cloud, tooling, PaaS, and more.