Key Takeaways: Adrian Cockcroft's talk on Netflix, CD, and Microservices
This article was originally published on 3/19/15
One of the big draws of the O'Reilly Software Architecture Conference was Adrian Cockcroft's talk, "Deliver Faster and Spend Less with Cloud Native Microservices." Cockcroft is an experienced speaker on the conference circuit and he's well-known as the architect who led Netflix into its new era of unprecedented scale and agility.
He now works for Battery Ventures, but he still draws primarily on his experiences at Netflix for his talks. He and his team were the ones behind the greatest success story for the latest trend in software architecture: microservices.
It all started in 2008 with a Netflix datacenter full of 'snowflakes'. After a major outage that year, the team concluded that they weren't very good at running datacenters and decided to hand that job to someone who was. This 'someone' would eventually be Amazon Web Services.
There are Many Reasons to Invest More in Speed
Companies should look at investing in the speed of their services wherever they can. Cloud computing has made it easier to deliver high performance at a fraction of what it used to cost.
There are still drawbacks that should not be overlooked, so business owners should review each part of their business to confirm they are serving customers as well as they can.
Every company is rushing to build the next great thing for its customers, and staying ahead of the competition depends on moving quickly.
A Quick Story About Netflix
While everyone who's contemplating a move to the cloud or to microservices probably has a unique situation, Cockcroft encouraged new adopters to remember the theme of Netflix:
Start with the simplest possible thing you can transition, make sure it's not customer-facing, and test it in the new system. You should establish your risk boundaries, and move them forward as you have more successes. Use the smallest instance that still allows you to learn all that you want.
Things moved over to AWS fairly quickly for Netflix. By 2010 they knew they would be out of capacity by Christmas, so most of their infrastructure had to move. What they used initially was a hybrid cloud, but that left a lot of things to manage in many different places. Cockcroft and his team compared it to a rider standing on two horses, with one leg on each.
Today, Cockcroft is much more confident in the maturity of the public cloud, and says that the hybrid cloud step could probably be skipped in most organizations.
Netflix got a lot of their engineering talent because of their major open source effort. There were several advantages to this:
Engineers who worked on a lot of open source projects had high levels of creativity
Developers felt more ownership over their work, and pride in it
Open source developers work well together because of their similar ways of thinking
Peer pressure from GitHub (having their name on a project) was a big motivator for engineers to work harder and not let the community of users down
If engineers leave, they're likely to keep working on the project, so you're still getting value for free!
Why was it a good idea for Netflix to open source so much of their architecture and associated tools?
It was actually a very smart move. They knew they were ahead of the game technologically, but they didn't want to get so far ahead of the rest of the industry that they'd later have to realign their architecture and technology with whatever trend or open direction the larger industry settled on. They wanted to become that standard and future trend. And they didn't want to be the only "Unicorn," because they truly believed they were not the only ones who could do this. (Today, Gilt, Twitter, SoundCloud, and others have proven this.)
Business managers have been in love with the OODA loop for years. It was defined by John Boyd, a US Air Force colonel, and it stands for "Observe, Orient, Decide, and Act." It's especially relevant to the Lean Startup mindset, which is probably why Netflix really liked the concept. Cockcroft shared his own OODA diagram with related concepts in software development:
Observe = Research (gather data) & Innovation
Orient = Big Data analytics
Decide = Culture (JFDI: Just F***ing Do It)
Act = Cloud-speed provisioning
Continuous Delivery is at the core of the cycle.
The Site Reliability Team
With large software teams, it becomes impossible to pinpoint the developer who caused a bug unless focused teams work on modular areas of the codebase. Many organizations have a reliability or monitoring team that either fixes the bugs itself or has to notify a bunch of people when something breaks.
The site reliability team at Netflix didn't do either of those things. Their only job was to identify which microservice caused the bug, and then notify only the developer who was tagged as working on the malfunctioning piece of code (or their former team leader, if they were no longer working on that code). They would also organize a meeting of multiple developers if necessary, but the developers (not monitoring or the Ops/platform team) were on the hook for supporting their microservices.
The team existed solely to notify and organize the relevant developers, whom it identified by monitoring each microservice, when something broke.
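The routing rule described above (notify the tagged developer, or their former team leader if they have moved on) can be sketched in a few lines. This is only an illustration, not Netflix's tooling: the Ownership record, the who_to_notify function, and the example names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Ownership:
    """Hypothetical ownership tag for one microservice."""
    owner: Optional[str]    # developer currently tagged on the service, or None
    former_team_lead: str   # fallback contact if the owner has moved on


def who_to_notify(failing_service: str, registry: Dict[str, Ownership]) -> str:
    """Pick the single person to notify about a failing microservice."""
    tag = registry[failing_service]
    # Prefer the tagged developer; otherwise fall back to the former team lead.
    return tag.owner if tag.owner is not None else tag.former_team_lead


# Made-up example data.
registry = {
    "recommendations": Ownership(owner="dana", former_team_lead="lee"),
    "billing": Ownership(owner=None, former_team_lead="sam"),
}

print(who_to_notify("recommendations", registry))  # dana
print(who_to_notify("billing", registry))          # sam
```

The point of the sketch is that the lookup returns exactly one person per failing service, which keeps the number of people interrupted by an incident small.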
The number of people inconvenienced by bugs and outages should be your metric. --Adrian Cockcroft
3 Kinds of Releases
Lots of developers have trouble with daily or weekly releases. Most of their issues come from a misunderstanding of what 'release' means in Continuous Delivery. There are three definitions:
Putting some code in production (this is the release CD is talking about: fast micro-releases)
Customers start seeing the code (a small feature released to customers)
A major marketing release (an aggregation of micro-releases). It's all about marketing, and it's calendar-driven
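One common way to keep the first two kinds of release separate is a feature flag: the code ships to production dark, and a flag decides when customers see it. The article does not say how Netflix did this, so treat the following as a minimal sketch under that assumption; render_homepage and the "new_row" flag are hypothetical names.

```python
from typing import Set


def render_homepage(enabled_flags: Set[str]) -> str:
    """Return the page a customer sees, given the flags that are switched on."""
    page = "homepage"
    # The new code path ships to production but stays dark until flagged on.
    if "new_row" in enabled_flags:
        page += " + new recommendations row"
    return page


# Release type 1: the code is in production, but no flag is on yet.
print(render_homepage(set()))        # homepage
# Release type 2: the flag is switched on and customers start seeing it.
print(render_homepage({"new_row"}))  # homepage + new recommendations row
```

With this split, deploying daily changes nothing for customers until someone flips the flag, and a marketing release is just a set of flags turned on by a calendar date.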
Give the developers the pain, and they'll automate everything out of the way.
The Boundaries of Microservices
Check out Martin Fowler's post on Bounded Context
Read the DDD book by Eric Evans
A microservice's immediate connections and context should be something one developer can fit in their head
One developer should be able to independently produce it
One "verb" (single concern/function, not GET/PUT/DELETE) per microservice
Should be possible to deploy in a container
Hopefully, these key takeaways from Adrian Cockcroft's talk clarified some aspects of microservices or agile organizational methodologies for you. There is even more content from this talk to discuss, and I'll cover those other facets in their own focused posts. Check out this blog post for further examination of Adrian's key themes from his microservice talks.