Best Practices for Docker and Microservices at Scale
Learn what a panel of experts had to say on both the good and the bad of using Docker and microservices together at scale.
The original post can be found on the Electric Cloud blog.
Our expert panel included:
Andreas Grabner, Technology Strategist at Dynatrace.
Chris Haddad, Chief Architect at Karux LLC.
Chris Riley, Analyst at Fixate.io.
Esko Luontola, Programmer and Interaction Designer.
Phil Dougherty, CEO at ContainerShip.
Our very own Anders Wallgren and Sam Fell.
During the episode, the panelists discussed the benefits of microservices and containers, some of their challenges, and best practices for building, deploying, and operating microservices on a large-scale Docker-ized infrastructure. Continue reading for their insights!
Docker and Microservices: Why?
Dougherty talked about microservices and Docker compatibility. “When you’re breaking down a monolith into small, composable services, Docker is going to come in handy. Why? Because you might have many different teams that are all working on individual services. Being able to package them up and share them amongst the group, the development team, compose them together, and be able to work on them easily is extremely important. If you’re going to do these constant deployments of smaller services instead of, every six months pushing out this giant monolith, you want to have a way to do that easily and have immutable artifacts that you can easily push out and deploy. So, they really kind of go hand-in-hand to, you know, making microservices easy to push out.”
Fell added to Dougherty’s comment. “The immutability factor, I think that’s one that a lot of people would agree with. It’s important if you want that parity or that fidelity across the pipeline. The ‘It worked in my environment’ sort of argument largely goes away.”
Grabner chimed in on microservices and Docker. “Talking about pipeline speeds, we use Docker heavily for testing, for speeding up our pipeline — by parallel executing tests, by being able to test individual services, isolation, very often with every check in. I think this is just great, shifting left, finding problems early on. And I think Docker is a great enabler. If you build a microservice architecture, yes or no, that obviously then brings another benefit. But I believe Docker itself is already very beneficial in that case.”
When it comes to the synergy between Docker and microservices, Haddad added more. “If you’re really going to transform your digital business, you need to actually take a top-down approach. Define out your domains, and then containers enable you to point at a runtime executable, and even better yet, trace it back to specific git tags to say that I am rapidly incorporating new features into this domain object that’s placed in a container. That’s where the intersection is. You can conflate the two or keep them separate. Containers don’t equal microservices, but there are ways to piece it together, where there’s a natural peanut butter and jelly or synergy between the two.”
Riley added insight into the cultural aspect. “It also comes down to culture, teams working together, and not having to worry about everybody being exactly in sync. Most Agile environments are just a really fast waterfall. Maybe one out of 10 Agile environments I’ve seen is actually Agile. So, as we move up the chain, you know, you can’t be predicated on everybody getting their work in on time. So, the microservices especially, allow you to do some of the cool stuff that we’ve all talked about but haven’t yet fully actualized.”
Luontola focused on the impact on the development side. “If your system has a database or some other external data dependency, with just one command you have the dev environment running. It’s like, okay, here are some other teams that produce applications if they run on Docker. But even in these simple projects, where you just have a database and stuff, even there, Docker is a benefit. And I also have some open source projects that I’m maintaining, and one of them is called Retrolambda. I run the tests against Java 5, 6, 7, 8, and 9. So, it’s nice when I can just have the one container that has all of the environment set up.”
Highlighting the cost benefits of containers, Wallgren added more. “It’s almost a necessary response, in some ways, to get the overhead of running a service down on a given piece of hardware. Once you get down to it, there’s still a CPU at the end of that somewhere. No matter how serverless you are, you’re not CPU-less. I’m hanging on to that. If you don’t do it architecturally, if you start doing microservices, then at some point your CFO or CMO is going to walk into your office and say, ‘Hey, how come we have all these really expensive VMs? Can we use something cheaper?’ Containers, you know, really is the only way to go a lot cheaper in the deployment footprint for that. From the cost or resource utilization perspective, that’s a pretty big one, too. At a scale, obviously.”
Challenges: Testing, Security, Monitoring...
Grabner gives a personal account of monitoring. “I think monitoring has been seeing a big change for us, obviously. Not only the way we monitor Docker containers, how we get in there, but also what we do with the data and how we understand dependencies between the containers. Not only from the physical perspective, where they live and how they’re needing each other’s resources, but also the services that live in there, how they communicate end-to-end and how they, in the end, impact what is most important: the end user who is using the service base. That’s the problem that we try to solve.”
Haddad also taps into personal experience. “When you have a proliferation of containers, you have to be ready to apply testing, security, and monitoring at scale. It’s really easy to point to one server and one location, or one VM and one location and say, ‘I want to monitor that one thing, and I’m going to reach out and pull that machine for all the information, all the log files.’ One client, their mind just blew up. They thought it was the zombie apocalypse because their traditional monitoring tool was based on a reach-out-and-pull mentality. And they couldn’t get their heads around that, well, you don’t know where these containers are. You don’t know how many there are, so you need to push out the logs to a central aggregator service.”
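The push model Haddad describes can be sketched in a few lines of Python. This is a minimal illustration, not any panelist's setup: it assumes each container writes structured log lines to standard output and that a log driver or shipper (not shown) forwards them to the central aggregator. The names `JsonLineFormatter` and `configure_push_logging` and the field names are hypothetical.

```python
import json
import logging
import socket
import sys


class JsonLineFormatter(logging.Formatter):
    """Formats each record as one JSON object per line, tagged with its origin."""

    def __init__(self, service):
        super().__init__()
        self._service = service
        # Inside a Docker container the hostname is, by default, typically
        # the short container ID, which lets the aggregator tell instances apart.
        self._host = socket.gethostname()

    def format(self, record):
        return json.dumps({
            "ts": record.created,
            "level": record.levelname,
            "service": self._service,
            "host": self._host,
            "logger": record.name,
            "message": record.getMessage(),
        })


def configure_push_logging(service, stream=sys.stdout):
    """Return a logger that emits JSON lines to `stream` for a shipper to forward."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter(service))
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Because every line carries its own service and host fields, the aggregator never needs to know where a container lives or how many copies are running; it only has to accept whatever arrives.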
Riley added to the monitoring conversation. “When you start deploying services across your application and you deploy the same service in different regions, even if you only have a handful of services per, the problem is, it gets big, but it’s not a new problem. This is not new at all. I mean, it becomes a big deal because of scale and volume, but we have the tools to solve this. And I think one of the things that some organizations are falling in the trap of is expecting their tools that they acquire, their monitoring tools, just to suddenly deliver magic, and I think that’s part of the risk of using the term AI… You don’t build dashboards just to build dashboards. You build dashboards to consume them in some way. And in some respects, this is an information architecture problem that starts at your private repo. I think organizations that complain about monitoring are all the same organizations who have snowflake configurations and snowflake images, and don’t treat their containers as immutable. Visibility starts very early on. So that’s my point. I think this is a solvable problem. It’s a big deal. I know of three brand new vendors that are doing container native, microservices native monitoring. So you know the market’s out there.”
Luontola offers advice on testing. “One of the challenges about microservices is that the general advice from at least one or two years ago was don’t start with microservices. So, start with a monolith, because it’s so much faster and easier to develop when you have, let’s say, a maximum of 10 people working on the same code base. Then start to speed up things and you will need to start making all these monitoring and network stuff, and all the retries. Advice I’ve heard is to test how resilient your system is to all these failures and so on: make it so that it randomly duplicates messages or drops messages, and your system should survive that with no problems.”
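The kind of resilience test Luontola mentions might look like the following Python sketch. It is an illustration under stated assumptions: `FlakyChannel`, `IdempotentLedger`, and `send_with_retry` are hypothetical names, the "network" is an in-process function call, and only lost requests are modeled (a lost acknowledgement would look the same to the sender and is handled by the same deduplication).

```python
import random


class FlakyChannel:
    """Delivers messages to a handler, randomly dropping or duplicating them."""

    def __init__(self, handler, drop_rate=0.2, dup_rate=0.2, seed=0):
        self._handler = handler
        self._drop_rate = drop_rate
        self._dup_rate = dup_rate
        self._rng = random.Random(seed)  # seeded so a failing run can be replayed

    def send(self, message):
        """Return True if at least one copy was delivered (the acknowledgement)."""
        if self._rng.random() < self._drop_rate:
            return False  # message lost in transit
        self._handler(message)
        if self._rng.random() < self._dup_rate:
            self._handler(message)  # the same message arrives a second time
        return True


class IdempotentLedger:
    """Consumer that applies each message ID at most once."""

    def __init__(self):
        self.balance = 0
        self._seen = set()

    def handle(self, message):
        msg_id, amount = message
        if msg_id in self._seen:
            return  # duplicate: already applied, ignore it
        self._seen.add(msg_id)
        self.balance += amount


def send_with_retry(channel, message, max_attempts=50):
    """Resend until the channel acknowledges delivery."""
    for _ in range(max_attempts):
        if channel.send(message):
            return
    raise RuntimeError(f"gave up on {message!r}")
```

A test would wire an `IdempotentLedger` to a `FlakyChannel`, send a known set of messages through `send_with_retry`, and assert that the final balance matches what a perfect network would have produced. Retries cover the drops and the message IDs cover the duplicates.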
Dougherty adds insight into the importance of security. “The other thing is security when it comes to layers of your containers being… Luckily, we have a lot of startups that have come out, and Docker itself is doing a lot of work around this when it comes to trusted registry stuff and being able to ensure that we have consistency in the layers of our containers. Because, people are going out and they’re picking a base image, and they’re building all their stuff off of it, but, something can happen upstream from them that poisons everything they’ve done. So, we’re getting a lot of benefits when it comes to deployments and scalability and putting more power in the hands of developers.”
Referencing the shift left, Wallgren added more. “We have to sort of shift left also, not just as we were talking about earlier, kind of monitoring that we do, but deployments. If the first time you’re exercising your deployment functionality is when you go into production, that’s not as prevalent with containers because good luck doing a deployment of containers manually, right? I mean, most of us are using some form of automation. But what is the fidelity of that process vis-à-vis the fidelity when you go into production. I mean, is it different? When something breaks while you’re testing, do you deal with it the same way when it breaks in production? If you don’t, why not? And I’m sure there are legitimate reasons to treat them differently, but, you know, you should know them and understand them.”
Best Practices for Docker and Microservices at Scale
Grabner emphasizes the importance of stopping bad code changes early. “Monitoring is great in production to know that you messed up. But it’s even better in a pipeline to actually figure out what impact my code change potentially has. So if I make a code change to a service and I actually know this service is going to be used by 80% of our users because it’s a number one service that everyone uses, and then I’m going to change it in a way that causes 20% more CPU cycles and 50% more interactions with an external service, then I should think twice before I push it into the next life cycle phase. So what we actually promote is, if you test your services, if you do your unit tests, your integration tests, your REST API tests, not to only look at performance, response time, and functionality, but to look at key resource metrics, resource consumption.”
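One way to act on this in a pipeline is to compare resource metrics from the current build against a stored baseline and flag anything that grew past a budget. The Python sketch below assumes the metrics have already been collected into plain dictionaries. The function name `resource_regressions`, the metric names, and the 10% default budget are illustrative choices, not something the panel specified.

```python
def resource_regressions(baseline, current, max_increase=0.10):
    """Return {metric: fractional increase} for metrics that grew past the budget."""
    regressions = {}
    for metric, old in baseline.items():
        new = current.get(metric)
        if new is None or old <= 0:
            continue  # metric missing, or no meaningful baseline to compare with
        increase = (new - old) / old
        if increase > max_increase:
            regressions[metric] = increase
    return regressions
```

For example, a baseline of `{"cpu_ms": 100.0, "external_calls": 4}` against a current run of `{"cpu_ms": 120.0, "external_calls": 6}` flags both metrics, and a pipeline step could fail the build or require sign-off whenever the returned dictionary is not empty.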
Haddad on what he tells his clients. “I’m telling them that they should take the time to understand the best practices and know that many times, the best practices don’t come for free. So, we mentioned that a container is just a process. Well, in the olden days, when we were just on a Linux machine, we could actually associate a user that that process would run under. It’s very easy to run Apache HTTPD under the www-data user. With containers, baking in least privilege and a set of user credentials or, you know, cloud credentials is difficult. So that’s going to take some time. It’s predicated on having a running cluster. As a panelist said, just bootstrapping up a cluster is difficult. You have to understand cloud. You have to understand container infrastructure at scale.”
Riley offered several case studies up as success examples. “One is the Dollar Shave Club. I think that they’re a great example of building an application all microservices driven. The thing is, they were born microservices. So, that’s a big deal. It wasn’t a transition. That, I’m going to use as an example of starting net new — brand new product, building it up from the bottom, microservices. Again, Autotrader, their approach is a little bit different. They put a lot of planning on how they’re going to deploy microservices, but they’re doing it in pieces. And another big one, because I think that this is embarrassing for anybody who’s not at least considering, is the GSA, which is this government organization responsible for all contract procurement for everything, down to pencils up to airplanes and buildings. They have a full-blown application development system based on microservices and containers. And I think if they can do it and all they knew was waterfall – then, you have to believe you can, too.”
Luontola added his best practices advice. “Start small before scaling up, that’s okay. First, one approach at a time, one service they are located. If it goes well, then maybe two projects and get those working well with continuous delivery and all the best practices before trying to do it all.”
Dougherty advises careful planning ahead of time. “I think it’s important not to jump right into doing this stuff just because it’s the cool, hip thing to do. You have to have some kind of goal and benefit that you’re trying to get to. You have to have kind of an end game, and think about it ahead of time, kind of as Chris said. And make sure that you’re not just diving into it head first, and make sure that you thought through the architecture that you’re going to deploy to. You thought through the pipeline you’re going to have as you move through staging, user acceptance testing, and the production. Make sure that you thought through security. And actually, in some cases, your application might not really be cut out to run in this type of environment, and that’s okay.”
Wallgren offers a personal example that emphasizes testing. “I think monitoring the entire testing is pretty important, and I’ll give you a very simple trivial example, which is to monitor your logs during unit testing. We – many, many years ago – had an example where we had a production problem. It turns out the production problem was happening during unit testing as well, but it was just logging problems. It wasn’t throwing exceptions. So, the test passed, right? We had a rule that says during unit testing, we monitor the logs. If there are any unaccounted for warnings or errors in the log — that’s a problem. That fails the test. And we had to learn that lesson the hard way.”
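A rule like the one Wallgren describes can be approximated with Python’s standard `logging` and `unittest` modules. This is a sketch of the idea, not Electric Cloud’s implementation: `LogCapture`, `LogCheckedTestCase`, and the `expected_log_fragments` allowlist are hypothetical names, and it assumes the code under test logs through the standard `logging` module with records propagating to the root logger.

```python
import logging
import unittest


class LogCapture(logging.Handler):
    """Collects every record at WARNING level or above."""

    def __init__(self):
        super().__init__(level=logging.WARNING)
        self.records = []

    def emit(self, record):
        self.records.append(record)


class LogCheckedTestCase(unittest.TestCase):
    """Fails any test that logs a warning or error it did not declare."""

    # Substrings of warning/error messages that a test expects to see.
    expected_log_fragments = ()

    def setUp(self):
        self._capture = LogCapture()
        logging.getLogger().addHandler(self._capture)
        # Cleanups run even when the test body fails, so the handler is always removed.
        self.addCleanup(self._check_logs)

    def _check_logs(self):
        logging.getLogger().removeHandler(self._capture)
        unexpected = [
            f"{r.levelname}: {r.getMessage()}"
            for r in self._capture.records
            if not any(frag in r.getMessage() for frag in self.expected_log_fragments)
        ]
        if unexpected:
            self.fail("unaccounted log output: " + "; ".join(unexpected))
```

Tests then subclass `LogCheckedTestCase` instead of `unittest.TestCase`. A subclass that overrides `setUp` needs to call `super().setUp()`, and a test that deliberately triggers a warning lists a fragment of it in `expected_log_fragments` so that only unaccounted messages fail the run.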
Fell also offers insight. “Don’t just let the developers throw something in and say, ‘Okay, you’ve got to run this whole cluster of stuff in production,’ if you don’t know what that thing is doing or how to help prevent a problem or fix a problem when a problem occurs. Otherwise, then the fingers are going to be pointed squarely at you.”
Published at DZone with permission of Anders Wallgren , DZone MVB. See the original article here.