5 Common Mistakes When Writing Docker Compose
Learn some best practices for Docker Compose files to improve developer productivity.
When building a containerized application, developers need a way to boot containers they’re working on to test their code. While there are several ways to do this, Docker Compose is one of the most popular options. It makes it easy to:
- Specify what containers to boot during development
- Set up a fast code-test-debug development loop
The vision is that someone writes a docker-compose.yml that specifies everything that’s needed in development and commits it to their repo. Then, every developer simply runs docker-compose up, which boots all the containers they need to test their code.
However, it takes a lot of work to get your docker-compose setup to peak performance. We’ve seen the best teams booting their development environments in less than a minute and testing each change in seconds.
Given how much time every developer spends testing their code every day, small improvements can add up to a massive impact on developer productivity.
Mistake 1: Frequent Container Rebuilds
docker build takes a long time. If you’re rebuilding your container every time you want to test a code change, you have a huge opportunity to speed up your development loop.
The traditional workflow for working on non-containerized applications looks like this:
- Code
- Build
- Run
This process has been highly optimized over the years, with tricks like incremental builds for compiled languages and hot reloading. It’s gotten pretty fast.
When people first adopt containers, they tend to take their existing workflow and just add a docker build step. Their workflow ends up like this:
- Code
- Build
- Docker Build
- Run
If not done well, that docker build step tosses all those optimizations out the window. Plus, it adds a bunch of additional time-consuming work like reinstalling dependencies with apt-get. All of this adds up to a much slower test process than we had before Docker.
Solution: Run your code outside of Docker
One approach is to boot all your dependencies in Docker Compose, but run the code you’re actively working on locally. This mimics the workflow for developing non-containerized applications.
Just expose your dependencies over localhost and point the service you’re working on at the localhost:<port> addresses.
However, this is not always practical, particularly if the code you’re working on depends on things built into the container image that aren’t easy to access from your laptop.
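As a rough sketch of this setup, a Compose file along these lines boots only the dependencies and publishes their ports on localhost. The choice of Postgres and Redis, the image tags, and the password are illustrative assumptions rather than anything from a real project.

```yaml
version: "3"
services:
  # Hypothetical dependencies; swap in whatever your service actually needs.
  postgres:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD: devpassword   # development-only placeholder
    ports:
      - "127.0.0.1:5432:5432"          # reachable from the laptop as localhost:5432
  redis:
    image: redis:5
    ports:
      - "127.0.0.1:6379:6379"          # reachable from the laptop as localhost:6379
```

The service you’re editing then runs directly on your laptop, configured to connect to localhost:5432 and localhost:6379.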
Solution: Maximize caching to optimize your Dockerfile
If you must build Docker images, writing your Dockerfiles so that they maximize caching can turn a 10-minute Docker build into a 1-minute one.
A typical pattern for production Dockerfiles is to reduce the number of layers by chaining individual commands into one RUN statement. However, image size doesn’t matter in development. In development, you want the most layers possible.
Your production Dockerfile might look like this:
RUN \
go get -d -v \
&& go install -v \
&& go build
This is terrible for development because every time that command is re-run, Docker will re-download all of your dependencies and reinstall them. An incremental build is more efficient.
Instead, you should have a dedicated Dockerfile specifically for development. Break everything into tiny little steps, and plan your Dockerfile so that the steps based on code that changes frequently come last.
The stuff that changes least frequently, like pulling dependencies, should go first. This way, you don’t have to build the entire project when rebuilding your image. Docker reuses the cached layers for the earlier steps, so you just have to build the tiny last piece you just changed.
For an example of this, see the Dockerfile we use for Blimp development below. It follows the techniques described above to shrink a heavy build process down to a couple of seconds.
FROM golang:1.13-alpine as builder
RUN apk add busybox-static
WORKDIR /go/src/github.com/kelda-inc/blimp
ADD ./go.mod ./go.mod
ADD ./go.sum ./go.sum
ADD ./pkg ./pkg
ARG COMPILE_FLAGS
RUN CGO_ENABLED=0 go install -i -ldflags "${COMPILE_FLAGS}" ./pkg/...
ADD ./login-proxy ./login-proxy
RUN CGO_ENABLED=0 go install -i -ldflags "${COMPILE_FLAGS}" ./login-proxy/...
ADD ./registry ./registry
RUN CGO_ENABLED=0 go install -i -ldflags "${COMPILE_FLAGS}" ./registry/...
ADD ./sandbox ./sandbox
RUN CGO_ENABLED=0 go install -i -ldflags "${COMPILE_FLAGS}" ./sandbox/...
ADD ./cluster-controller ./cluster-controller
RUN CGO_ENABLED=0 go install -i -ldflags "${COMPILE_FLAGS}" ./cluster-controller/...
RUN mkdir /gobin
RUN cp /go/bin/cluster-controller /gobin/blimp-cluster-controller
RUN cp /go/bin/syncthing /gobin/blimp-syncthing
RUN cp /go/bin/init /gobin/blimp-init
RUN cp /go/bin/sbctl /gobin/blimp-sbctl
RUN cp /go/bin/registry /gobin/blimp-auth
RUN cp /go/bin/vcp /gobin/blimp-vcp
RUN cp /go/bin/login-proxy /gobin/login-proxy
FROM alpine
COPY --from=builder /bin/busybox.static /bin/busybox.static
COPY --from=builder /gobin/* /bin/
One final note: with the recent introduction of multi-stage builds, it’s now possible to create Dockerfiles that have both good layering and small image sizes. We won’t discuss this in much detail in this post, other than to say that the Dockerfile shown above does just that and, as a result, is used for both Blimp development and production.
Solution: Use host volumes
In general, the best option is to use a host volume to directly mount your code into the container. This gives you the speed of running your code natively, while still running in the Docker container containing its runtime dependencies.
Host volumes mirror a directory on your laptop into a running container. When you edit a file in your text editor, the change is automatically synced into the container and then can be immediately executed within the container.
Most languages have a way to watch your code and automatically re-run it when it changes. For example, nodemon is the go-to for JavaScript. Check out this post for a tutorial on how to set this up.
It takes some work initially, but the result is that you can see the results of your code changes in 1-2 seconds, versus a Docker build which can take minutes.
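A minimal sketch of what this can look like for a Node.js service is shown below. The service name, image tag, paths, port, and the server.js entry file are assumptions for illustration, and it assumes nodemon is listed in the project’s dependencies and that they have already been installed.

```yaml
version: "3"
services:
  web:                          # hypothetical service name
    image: node:12              # assumed runtime image
    working_dir: /usr/src/app
    volumes:
      - "./:/usr/src/app"       # host volume: mirrors the project directory into the container
    # nodemon restarts the server whenever a mounted file changes
    command: ["npx", "nodemon", "server.js"]
    ports:
      - "3000:3000"             # assumed application port
```

With a setup like this, saving a file in your editor is enough to trigger a restart inside the container, with no docker build step in the loop.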
Mistake 2: Slow Host Volumes
If you’re already using host volumes, you may have noticed that reading and writing files can be painfully slow on Windows and Mac. This is a known issue for commands that read and write lots of files, such as Node.js and PHP applications with complex dependencies.
This is because Docker runs in a VM on Windows and Mac. When you do a host volume mount, it has to go through lots of translation to get the folder on your laptop into the container, somewhat similar to a network file system. This adds a great deal of overhead, which isn’t present when running Docker natively on Linux.
Solution: Relax strong consistency
One of the key problems is that file-system mounts by default maintain strong consistency. Consistency is a broad topic on which much ink has been spilt, but in short it means that all of a particular file’s readers and writers agree on the order in which any file modifications occurred, and thus agree on the contents of that file (eventually, sort of).
The problem is, enforcing strong consistency is quite expensive, requiring coordination between all of a file’s writers to guarantee they don’t inappropriately clobber each other’s changes.
Strong consistency can be particularly important when, for example, running a database in production. The good news is that in development, it’s not required. Your code files are going to have a single writer (you), and a single source of truth (your repo). As a result, conflicts aren’t as big a concern as they might be in production.
For just this reason, Docker implemented the ability to relax consistency guarantees when mounting volumes. In Docker Compose, you can simply add the cached keyword to your volume mounts to get a significant performance improvement. (Don’t do this in production …)
volumes:
- "./app:/usr/src/app/app:cached"
Solution: Code syncing
Another approach is to set up code syncing. Instead of mounting a volume, you can use a tool that notices changes between your laptop and the container and copies files to resolve the differences (similar to rsync).
The next version of Docker has Mutagen built in as an alternative to cached mode for volumes. If you’re interested, you can wait until Docker makes its next release and try that out, or check out the Mutagen project to use it without waiting. Blimp, our Docker Compose implementation, achieves something similar using Syncthing.
Solution: Don’t mount packages
With languages like Node, the bulk of file operations tend to be in the packages directory (like node_modules). As a result, excluding these directories from your volumes can yield a significant performance boost.
In the example below, we have a volume mounting our code into a container. We then overlay just the node_modules directory with its own clean, dedicated volume.
volumes:
- ".:/usr/src/app"
- "/usr/src/app/node_modules"
This additional volume mount tells Docker to use a standard volume for the node_modules directory so that when npm install runs it doesn’t use the slow host mount. To make this work, when the container first boots up we run npm install in the entrypoint to install our dependencies and populate the node_modules directory. Something like this:
entrypoint:
- "sh"
- "-c"
- "npm install && ./node_modules/.bin/nodemon server.js"
Full instructions to clone and run the above example can be found here.
Mistake 3: Brittle Configuration
Most Docker Compose files evolve organically. We typically see tons of copied-and-pasted code, which makes modifications hard. A clean Docker Compose file makes it easier to keep up with regular updates as production changes.
Solution: Use env files
Env files separate environment variables from the main Docker Compose configuration. This is helpful for:
- Keeping secrets out of the git history
- Making it easy to have slightly different settings per developer. For example, each developer may have a unique access key. Saving the configuration in a .env file means that they don’t have to modify the committed docker-compose.yml file and deal with conflicts as the file is updated.
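To use env files, just add a .env file next to your docker-compose.yml, which Compose reads automatically when substituting variables in the file, or set the path of a file to load into a service’s environment explicitly with the env_file field.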
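As a small illustration of both mechanisms, the fragment below loads a per-developer file into a container with env_file and also substitutes a variable from .env (or the shell) into the Compose file itself. The service name, image, the dev.env path, and the ACCESS_KEY variable are all hypothetical.

```yaml
version: "3"
services:
  api:                              # hypothetical service name
    image: example/api              # hypothetical image
    env_file:
      - ./dev.env                   # hypothetical file of KEY=value lines, loaded into the container
    environment:
      # ${ACCESS_KEY} is filled in from the .env file in the project
      # directory (or from the shell) when Compose parses this file.
      ACCESS_KEY: "${ACCESS_KEY}"
```

For this to keep secrets out of the git history, the env files themselves need to be listed in .gitignore.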
Solution: Use override files
Override files let you have a base configuration, and then specify the modifications in a different file. This can be really powerful if you use Docker Swarm, and have a production YAML file. You can store your production configuration in docker-compose.yml, then specify any modifications needed for development, such as using host volumes, in an override file.
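For example, a development override might look roughly like the sketch below. It assumes the base docker-compose.yml defines a service named web (a hypothetical name) and uses the same file format version; the mount path and variable are likewise illustrative.

```yaml
# docker-compose.override.yml: development-only changes layered on top of
# the base docker-compose.yml in the same directory.
version: "3"
services:
  web:                        # must match a service name in the base file
    volumes:
      - "./:/usr/src/app"     # host volume used only in development
    environment:
      NODE_ENV: development   # illustrative development setting
```

docker-compose up picks up a file named docker-compose.override.yml automatically and merges it over docker-compose.yml. Files with other names can be combined explicitly by passing several -f flags.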
Solution: Use extends
If you’re using version 2 of the Compose file format, you can use the extends keyword to import snippets of YAML in multiple places. For example, you might decide that all services at your company will have the same five configuration options in their Docker Compose file in development. You can define that once, and then use the extends keyword to drop it in everywhere it’s needed, which gives you some modularity. It’s painful that we have to do this in YAML, but it’s the best we have short of writing a program to generate it.
Version 3 of the file format removed support for the extends keyword. However, you can achieve a similar result with YAML anchors.
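A minimal sketch of the anchor approach follows. It assumes file format 3.4 or later, which allows top-level extension fields prefixed with x-; the service names, images, and shared options are made up for illustration.

```yaml
version: "3.4"

# Shared options defined once under an extension field and tagged with an anchor.
x-common: &common
  restart: unless-stopped
  environment:
    LOG_LEVEL: debug          # illustrative shared setting

services:
  web:                        # hypothetical service
    <<: *common               # merge the shared options into this service
    image: example/web
  worker:                     # hypothetical service
    <<: *common
    image: example/worker
```

Note that the merge is shallow: if a service defines its own environment key, it replaces the shared one rather than being combined with it.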
Solution: Programmatically generate Compose files
We’ve worked with some engineering teams using Blimp that have a hundred containers in their development Docker Compose file. If they were to use a single giant Docker Compose file it would require thousands of lines of unmaintainable YAML.
As you scale, it’s okay to write a script to generate Docker Compose files based on some higher-level specifications. This is common for engineering teams with really large development environments.
Mistake 4: Flaky Boots
Does docker-compose up only work half the time? Do you have to run docker-compose restart to bring up crashed services?
Most developers want to write code, not do DevOps work. Debugging a broken development environment is super frustrating.
docker-compose up should just work, every single time.
Most of the issues here are related to services starting in the wrong order. For example, your web application may rely on a database, and will crash if the database isn’t ready when the web application boots.
Solution: Use depends_on
depends_on lets you control startup order. By default, depends_on only waits until the dependency has been started, and doesn’t wait for the dependency to be “healthy”. However, version 2.1 and later of the v2 Compose file format support combining depends_on with healthchecks. (Unfortunately, this feature was removed in file format v3. Instead, you can manually implement something similar with a script like wait-for-it.sh.)
The Docker documentation recommends against approaches like depends_on and wait-for-it.sh. And we agree: in production, requiring a specific boot order for your containers is a sign of a brittle architecture. However, as an individual developer trying to get your job done, fixing every single container in the entire engineering organization may not be feasible. So, for development, we think it’s OK.
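As a sketch of the healthcheck variant, the file below uses the 2.4 file format so that the web service only starts once the database reports healthy. The service names, image tag, password, and check intervals are assumptions for illustration.

```yaml
version: "2.4"
services:
  db:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD: devpassword   # development-only placeholder
    healthcheck:
      # pg_isready ships with the postgres image and exits 0 once the
      # server accepts connections.
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10
  web:                                 # hypothetical application service
    build: .
    depends_on:
      db:
        condition: service_healthy     # wait for the healthcheck, not just container start
```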
Mistake 5: Poor Resource Management
It can get tricky to make sure that Docker has the resources it needs to run smoothly, without completely overtaking your laptop. There are a couple of things you can look into if you feel like your development workflow is sluggish because Docker isn’t running at peak capacity.
Solution: Change Docker Desktop allocations
Docker Desktop needs a lot of RAM and CPU, particularly on Mac and Windows where it’s a VM. The default Docker Desktop configuration tends not to allocate enough RAM and CPU, so we generally recommend tweaking the settings to over-allocate. I tend to allocate about 8GB of RAM and 4 CPUs to Docker when I’m developing (and I turn Docker Desktop off when not in use to make that workable).
Solution: Prune unused resources
People frequently leak resources unintentionally when using Docker. It’s not uncommon for folks to have hundreds of volumes, old container images, and sometimes running containers if they’re not careful. That’s why we recommend occasionally running docker system prune, which deletes stopped containers, unused networks, and dangling images. Unused volumes are only removed if you add the --volumes flag. That can free up a lot of resources.
Solution: Run in the cloud
Finally, in some cases even with the above tips, it may be impossible to efficiently run all of the containers you need on your laptop. If that’s the case, check out Blimp, an easy way to run Docker Compose files in the cloud.
What Should You Do?
TL;DR: To improve the developer experience on Docker Compose, I’d encourage you to:
- Minimize container rebuilds.
- Use host volumes.
- Strive for maintainable compose files, just like code.
- Make your boots reliable.
- Manage resources mindfully.
---
See a tutorial on setting up host volumes for faster Docker development
Read what a mysterious bug taught us about how Docker stores registry credentials
Check out Blimp, our team’s project to improve developer productivity for Docker Compose.
Published at DZone with permission of Ethan J Jackson. See the original article here.