We’re not afraid to admit it. Our success was killing us.
New Relic offers great SaaS-based APM software, but we were struggling to provide fast response times as we grew to serve trillions of event queries a day. The story of how containers helped us move from a single monolithic application to a modern, scalable software stack offers valuable lessons for enterprises making similar journeys.
I will touch on the basic story here, but to learn how we scaled up using containers and microservices, check out my recent webinar, which is embedded at the bottom of this post.
Setting the Stage
New Relic now has more than 13,000 customers generating 2 million events a second. That’s a lot of data to handle, and dealing with it wasn’t always easy.
We started off with a true monolith. We had a single application that ran our entire business. It contained the agent, the data-collection pipeline, and the web interface. As New Relic grew, we divided that out into two applications—our “duolith.” One part was a Ruby on Rails web application, which is the user interface. The other was a Java data collection pipeline.
Over time, however, as our customer base grew, as our feature set grew, and as our company grew, we started to experience the kinds of problems familiar to many growing companies. Small fixes took a long time to get into the code base and to customers. Engineers were colliding with each other trying to create new features. Service quality suffered because of communication problems. As New Relic’s success continued to grow, so too did those problems. Eventually, the New Relic Engineering team decided to address them by moving to a service architecture.
(Would developers create new services? We expected a few dozen services to be created. As of May 2016, we were running more than 200!)
Deploying Services Faster
Containers—specifically, Docker—were a big part of our approach. Docker gave the New Relic operations team a standardized way to manage the hundreds of services. As Karl Matthias and Sean P. Kane write in their book Docker: Up & Running, Shipping Reliable Containers in Production:
It is hard and often expensive to get communication and processes right between teams of people, even in smaller organizations. Yet we live in a world where the communication of detailed information between teams is increasingly required to be successful. A tool that reduces the complexity of that communication while aiding in the production of more robust software would be a big win. And that’s exactly what we found with Docker.
For New Relic, Docker’s advantages have played out in terms of vastly increased agility. With containers, developers can provide new functionality in smaller and smaller units, steadily shortening the time from idea (or fix) to consumption by users. Just as important, since different pieces of the system can be swapped out or scaled up without affecting the rest, we can roll out releases without downtime.
Docker => Synthetics
A great example of our use of Docker is New Relic Synthetics, which is designed so you can proactively monitor your applications. Each Synthetics script runs in its own Docker container, and the container goes away as soon as the script is done. This approach is designed to enhance performance (the container spins up very quickly) and security (the container is alive for such a short time that it is harder to hack).
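To make the one-container-per-script idea concrete, here is a minimal sketch of launching a throwaway container from Python. It is not how Synthetics is actually implemented. The image name, the mount point `/work/script.js`, the `node` entry command, and both function names are assumptions made up for illustration. The only Docker behavior relied on is that `docker run --rm` removes the container once its process exits.

```python
import os
import subprocess


def build_run_command(image, script_path):
    """Build a `docker run` command for a one-shot, throwaway container.

    The image name and the in-container path are hypothetical placeholders.
    """
    host_path = os.path.abspath(script_path)  # bind mounts need an absolute host path
    return [
        "docker", "run",
        "--rm",  # delete the container as soon as the script finishes
        "--volume", f"{host_path}:/work/script.js:ro",  # mount the script read-only
        image,
        "node", "/work/script.js",  # assumed entry command for the script
    ]


def run_script_in_container(image, script_path):
    """Run one script in its own container and return (exit_code, stdout)."""
    result = subprocess.run(
        build_run_command(image, script_path),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

Calling `run_script_in_container("example/runner:latest", "/tmp/check.js")` would start a fresh container, run the script, and leave nothing behind afterward, assuming Docker is installed and the image exists.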
Docker also provides a level of abstraction that is designed to make it much easier to scale. Containers are much more nimble than virtual machines (VMs). It’s easy to deploy new containers programmatically, and our Docker system is elastic; it grows and shrinks automatically depending on the load. Finally, since containers share the host’s kernel instead of each running a full guest operating system, they have much less overhead than VMs. Doing the same jobs, VMs are much more CPU and memory intensive.
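As a rough illustration of what "grows and shrinks automatically depending on the load" can mean, the sketch below computes how many containers to run for a given load. This is a generic sizing rule, not a description of New Relic's system. The function name, the load units, and the minimum and maximum limits are all hypothetical.

```python
import math


def desired_containers(current_load, capacity_per_container,
                       min_containers=1, max_containers=50):
    """Return how many containers to run for the given load.

    All parameters are illustrative: `current_load` and
    `capacity_per_container` just need to share a unit
    (for example, jobs per minute).
    """
    if capacity_per_container <= 0:
        raise ValueError("capacity_per_container must be positive")
    # Round up so there is always enough capacity for the load.
    needed = math.ceil(current_load / capacity_per_container)
    # Clamp so the pool never drops to zero or grows without bound.
    return max(min_containers, min(max_containers, needed))
```

A scheduler could call this periodically and start or stop containers to match the result. For example, a load of 250 with a capacity of 100 per container calls for 3 containers.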
Watch the Webinar
For specific tips and insights into the lessons we learned on our epic journey to containerization, watch the webinar below:
Bonus: Get Docker Training
Learn how to get Docker up and running with online training from our friends at O’Reilly. The course features Sean Kane, New Relic senior site reliability engineer and co-author of Docker: Up & Running. The course is offered June 28 & 29, 2016, at 10:00 a.m. PT. Find out more here.