Dockerfile Optimization for Fast Builds and Light Images
Docker BuildKit can be used to optimize Dockerfiles and achieve significant gains in performance when building images, while reducing image size.
"Docker builds images automatically by reading the instructions from a Dockerfile -- a text file that contains all commands, in order, needed to build a given image."
The explanation above was extracted from Docker’s official docs and summarizes what a Dockerfile is for. Dockerfiles are important to work with because they are our blueprint, our record of layers added to a Docker base image.
We will learn how to take advantage of BuildKit, a set of enhancements introduced in Docker v18.09. Integrating BuildKit will give us better performance, storage management, and security.
Objectives
- Decrease build time
- Reduce image size
- Gain maintainability
- Gain reproducibility
- Understand multi-stage Dockerfiles
- Understand BuildKit features
Prerequisites
- Knowledge of Docker concepts
- Docker installed (currently using v19.03)
- A Java app (for this post, I used a sample Jenkins Maven app)
Let's get to it!
Simple Dockerfile Example
Below is an example of an unoptimized Dockerfile containing a Java app. This example was taken from a DockerCon conference talk. We will walk through several optimizations as we go.
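As a stand-in for the kind of Dockerfile described, here is a minimal sketch. The Debian base, the package list, and the jar name my-app-1.0-SNAPSHOT.jar are assumptions for illustration, and the jar is assumed to have been built on the host beforehand.

```dockerfile
# Unoptimized on purpose. Base image, packages, and jar name are assumptions.
FROM debian
# Copies the whole build context, including a jar already built on the host.
COPY . /app
RUN apt-get update
RUN apt-get -y install default-jdk ssh emacs
CMD ["java", "-jar", "/app/target/my-app-1.0-SNAPSHOT.jar"]
```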
Here, we may ask ourselves: how long does it take to build at this stage? To answer it, let's create this Dockerfile on our local development computer and tell Docker to build the image.
0,21s user 0,23s system 0% cpu 1:55,17 total
Here’s our answer: our build takes 1m55s at this point.
But what if we just enable BuildKit with no additional changes? Does it make a difference?
BuildKit can be enabled with two methods:
- Setting the DOCKER_BUILDKIT=1 environment variable when invoking the docker build command.
- Enabling BuildKit by default, by setting the BuildKit feature to true in the daemon configuration and restarting the daemon.
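Both methods can be sketched as comments at the top of a Dockerfile. The image tag my-app is a placeholder, and /etc/docker/daemon.json is the usual Linux location of the daemon configuration, which may differ on other platforms.

```dockerfile
# syntax=docker/dockerfile:1
# Method 1, per build (my-app is a placeholder tag):
#   DOCKER_BUILDKIT=1 docker build -t my-app .
# Method 2, daemon-wide: put the following in /etc/docker/daemon.json
# (usual Linux path, assumed here) and restart the Docker daemon:
#   { "features": { "buildkit": true } }
FROM debian
```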
BuildKit initial impact:
0,54s user 0,93s system 1% cpu 1:43,00 total
On the same hardware, the build took ~12 seconds less than before (103.00 s versus 115.17 s). This means the build time dropped by about 10.6% with almost no effort.
But now let’s look at some extra steps we can take to improve our results even further.
Order From Least to Most Frequently Changing
Because order matters for caching, we'll move the COPY command closer to the end of the Dockerfile.
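A minimal sketch of that reordering, assuming a Debian base, an illustrative package list, and a jar named my-app-1.0-SNAPSHOT.jar:

```dockerfile
# Base image, packages, and jar name are assumptions; the order is the point.
FROM debian
RUN apt-get update
RUN apt-get -y install default-jdk ssh emacs
# Source files change most often, so they are copied after the slow,
# rarely changing layers above, which can then be served from cache.
COPY . /app
CMD ["java", "-jar", "/app/target/my-app-1.0-SNAPSHOT.jar"]
```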
Avoid "COPY ."
Opt for more specific COPY arguments to limit cache busts. Only copy what’s needed.
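For instance, if only the built jar is needed at runtime, a sketch could copy just that file. The jar name and target/ location are assumptions.

```dockerfile
FROM debian
RUN apt-get update
RUN apt-get -y install default-jdk ssh emacs
# Only the artifact is copied (assumed name and path), so edits to other
# files in the build context no longer invalidate this layer.
COPY target/my-app-1.0-SNAPSHOT.jar /app/
CMD ["java", "-jar", "/app/my-app-1.0-SNAPSHOT.jar"]
```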
Couple apt-get update and install
This prevents using an outdated package cache. Cache them together or do not cache them at all.
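A sketch of the coupled form, with an assumed base image and package list:

```dockerfile
FROM debian
# update and install share one layer, so a cached, stale package index can
# never be combined with a fresh install step.
RUN apt-get update \
    && apt-get -y install default-jdk ssh emacs
```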
Remove Unnecessary Dependencies
Don’t install debugging and editing tools. You can install them later if you find you need them.
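One way this might look with apt, assuming the JDK is the only package the image really needs:

```dockerfile
FROM debian
# ssh and emacs are dropped (assumed unnecessary at runtime), and
# --no-install-recommends skips packages that are only recommended.
RUN apt-get update \
    && apt-get -y install --no-install-recommends default-jdk
```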
Remove Package Manager Cache
Your image does not need this cache data. Take the chance to free some space.
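With apt, a common way to do this is to delete the package lists in the same RUN instruction. The package choice here is an assumption.

```dockerfile
FROM debian
# The cleanup must happen in the same layer as the install; a later RUN
# would not shrink the layer that already contains the lists.
RUN apt-get update \
    && apt-get -y install --no-install-recommends default-jdk \
    && rm -rf /var/lib/apt/lists/*
```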
Use Official Images Where Possible
There are some good reasons to use official images, such as reducing the time spent on maintenance and reducing the size, as well as having an image that is pre-configured for container use.
Use Specific Tags
Do not use latest, as it’s a rolling tag. That’s asking for unpredictable problems.
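As a sketch, the base image can be pinned to a specific tag of the official openjdk image. The tag 8 and the jar name are assumptions chosen for illustration.

```dockerfile
# Pinned tag instead of the implicit, rolling latest.
FROM openjdk:8
# Assumed artifact name and location.
COPY target/my-app-1.0-SNAPSHOT.jar /app/
CMD ["java", "-jar", "/app/my-app-1.0-SNAPSHOT.jar"]
```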
Look for Minimal Flavors
You can reduce the base image size. Pick the lightest one that suits your purpose. Below is a short openjdk images list.
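As an illustration of choosing a lighter variant, the sketch below assumes that a JRE-only, Alpine-based tag is enough to run the app. Whether an Alpine image suits a given application (it uses musl instead of glibc) needs to be checked case by case.

```dockerfile
# JRE-only Alpine variant: assumed to be sufficient to run the jar.
FROM openjdk:8-jre-alpine
# Assumed artifact name and location.
COPY target/my-app-1.0-SNAPSHOT.jar /app/
CMD ["java", "-jar", "/app/my-app-1.0-SNAPSHOT.jar"]
```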
Build From a Source in a Consistent Environment
Maybe you do not need the whole JDK. If you intended to use JDK for Maven, you can use a Maven Docker image as a base for your build.
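A minimal sketch of building inside a Maven image, assuming a standard project layout (pom.xml plus src/), the tag 3.6-jdk-8-alpine, and an artifact named my-app-1.0-SNAPSHOT.jar:

```dockerfile
# Assumed Maven image tag; the build now runs inside the container,
# so it no longer depends on the tools installed on the host.
FROM maven:3.6-jdk-8-alpine
WORKDIR /app
COPY pom.xml .
COPY src ./src
# -e prints errors in full, -B runs Maven in non-interactive batch mode.
RUN mvn -e -B package
CMD ["java", "-jar", "/app/target/my-app-1.0-SNAPSHOT.jar"]
```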
Fetch Dependencies in a Separate Step
A Dockerfile command to fetch dependencies can be cached. Caching this step will speed up our builds.
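One possible shape for this, assuming a Maven project: copy pom.xml alone, resolve dependencies, and only then copy the sources.

```dockerfile
FROM maven:3.6-jdk-8-alpine
WORKDIR /app
COPY pom.xml .
# This layer is reused from cache as long as pom.xml does not change.
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package
# Assumed artifact name.
CMD ["java", "-jar", "/app/target/my-app-1.0-SNAPSHOT.jar"]
```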
Multi-Stage Builds: Remove Build Dependencies
Why use multi-stage builds?
- Separate the build from the runtime environment
- Different details on dev, test, lint specific environments
- Delinearizing dependencies (concurrency)
- Having platform-specific stages
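The first point, separating build from runtime, can be sketched as a two-stage Dockerfile. The stage name builder, the image tags, and the jar name are assumptions.

```dockerfile
# Build stage: contains Maven and the JDK.
FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn -e -B dependency:resolve
COPY src ./src
RUN mvn -e -B package

# Runtime stage: only a JRE plus the jar copied from the build stage.
FROM openjdk:8-jre-alpine
COPY --from=builder /app/target/my-app-1.0-SNAPSHOT.jar /
CMD ["java", "-jar", "/my-app-1.0-SNAPSHOT.jar"]
```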
If you build our application at this point...
0,41s user 0,54s system 2% cpu 35,656 total
...you'll notice our application takes ~35.66 seconds to build. It's a pleasant improvement. From here on, we will focus on features that serve more specific scenarios.
Multi-Stage Builds: Different Image Flavors
The Dockerfile below shows a different stage for a Debian and an Alpine based image.
To build a specific image on a stage, we can use the --target argument of docker build.
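A sketch of such a Dockerfile follows. The stage names release-debian and release-alpine, the image tags, and the jar name are hypothetical choices for illustration.

```dockerfile
# Build one flavor by naming its stage (my-app is a placeholder tag):
#   docker build --target release-alpine -t my-app .
FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -e -B package

# Debian-based flavor (hypothetical stage name).
FROM openjdk:8-jre-slim AS release-debian
COPY --from=builder /app/target/my-app-1.0-SNAPSHOT.jar /
CMD ["java", "-jar", "/my-app-1.0-SNAPSHOT.jar"]

# Alpine-based flavor (hypothetical stage name).
FROM openjdk:8-jre-alpine AS release-alpine
COPY --from=builder /app/target/my-app-1.0-SNAPSHOT.jar /
CMD ["java", "-jar", "/my-app-1.0-SNAPSHOT.jar"]
```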
Different Image Flavors (DRY/Global ARG)
The ARG command can control the image to be built. In this example, alpine is written as the default flavor, but we can pass --build-arg flavor=<flavor> on the docker build command to override it.
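A minimal sketch of the global ARG approach, assuming the flavor value maps directly onto an existing openjdk tag suffix such as alpine or slim:

```dockerfile
# Override the default at build time (my-app is a placeholder tag):
#   docker build --build-arg flavor=slim -t my-app .
# An ARG declared before the first FROM can be used in FROM lines.
ARG flavor=alpine

FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -e -B package

# One release stage serves every flavor; the jar name is an assumption.
FROM openjdk:8-jre-${flavor} AS release
COPY --from=builder /app/target/my-app-1.0-SNAPSHOT.jar /
CMD ["java", "-jar", "/my-app-1.0-SNAPSHOT.jar"]
```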
Concurrency
Concurrency is important when building Docker images, as it makes the most of the available CPU threads. In a linear Dockerfile, all stages are executed in sequence. With multi-stage builds, smaller dependency stages can be built in parallel and be ready for the main stage to use them.
BuildKit even brings another performance bonus. If stages are not used later in the build, they are directly skipped instead of processed and discarded when they finish. This means that in the stage graph representation, unneeded stages are not even considered.
Below is an example Dockerfile where a website's assets are built in a dedicated stage:
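A sketch under stated assumptions: the stage name assets, the placeholder HTML file, the image tags, and the jar name are all hypothetical, standing in for a real asset pipeline.

```dockerfile
FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -e -B package

# Does not depend on builder, so BuildKit can run both stages in parallel.
FROM alpine:3.12 AS assets
RUN mkdir -p /out && echo "<h1>Hello</h1>" > /out/index.html

# The release stage waits for both stages and collects their outputs.
FROM openjdk:8-jre-alpine AS release
COPY --from=builder /app/target/my-app-1.0-SNAPSHOT.jar /
COPY --from=assets /out /assets
CMD ["java", "-jar", "/my-app-1.0-SNAPSHOT.jar"]
```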
And here is another Dockerfile where C and C++ libraries are separately compiled and take part in the builder stage later on.
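A sketch of that shape follows. The directories libfoo/ and libbar/, their Makefiles with an install target honoring PREFIX, and the stage names are hypothetical and only illustrate the stage graph.

```dockerfile
# Hypothetical C library: libfoo/ contains a Makefile with an install target.
FROM gcc:8 AS lib-c
WORKDIR /src
COPY libfoo/ .
RUN make && make install PREFIX=/out

# Hypothetical C++ library, built independently of lib-c.
FROM gcc:8 AS lib-cpp
WORKDIR /src
COPY libbar/ .
RUN make && make install PREFIX=/out

# The builder stage consumes both outputs; the two library stages have no
# dependency on each other, so BuildKit can compile them concurrently.
FROM maven:3.6-jdk-8 AS builder
COPY --from=lib-c /out /usr/local
COPY --from=lib-cpp /out /usr/local
WORKDIR /app
COPY pom.xml .
COPY src ./src
RUN mvn -e -B package
```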
BuildKit Application Cache
BuildKit has a special feature for package manager caches. Here are some examples of typical cache folder locations:
We can compare this Dockerfile with the one presented in the section Build From a Source in a Consistent Environment. That earlier Dockerfile had no special cache handling. We can add it with a type of mount called cache:
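A minimal sketch of a cache mount for Maven, assuming the build runs as root so that the local repository lives in /root/.m2. The syntax line selects a Dockerfile frontend that supports RUN --mount.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.6-jdk-8-alpine AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
# /root/.m2 (assumed Maven repository path) is kept in a BuildKit cache
# between builds and is not written into the image layer.
RUN --mount=type=cache,target=/root/.m2 mvn -e -B package
```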
BuildKit Secret Volumes
To mix in some security features of BuildKit, let's see how secret type mounts are used and some cases they are meant for. The first scenario shows an example where we need to hide a secrets file from the final image:
To build this Dockerfile, pass the --secret argument like this:
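Both pieces can be sketched together, with the build command shown as a comment. The secret id aws, the credentials path, and the test command standing in for a real consumer of the file are assumptions.

```dockerfile
# syntax=docker/dockerfile:1
# Build with (id and source path are assumptions):
#   DOCKER_BUILDKIT=1 docker build --secret id=aws,src=$HOME/.aws/credentials .
FROM alpine:3.12
# The file is mounted only for the duration of this RUN instruction and is
# not stored in any image layer. test -s merely checks it is present and
# non-empty; a real build would run the command that needs the credentials.
RUN --mount=type=secret,id=aws,target=/root/.aws/credentials \
    test -s /root/.aws/credentials
```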
The second scenario is a method to avoid commands like COPY ./keys/private.pem /root/.ssh/private.pem, as we don't want our SSH keys to be stored on the Docker image after they are no longer needed. BuildKit has an ssh mount type to cover that:
To build this Dockerfile, you need to load your private SSH key into your ssh-agent and add the --ssh default argument to docker build, where default refers to the running agent that holds the key.
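A sketch of the ssh mount, assuming a private Git repository reachable over SSH. The repository URL and key path are hypothetical.

```dockerfile
# syntax=docker/dockerfile:1
# Build with the key loaded in the agent (key path is an assumption):
#   eval "$(ssh-agent)" && ssh-add ~/.ssh/id_rsa
#   DOCKER_BUILDKIT=1 docker build --ssh default .
FROM alpine:3.12
RUN apk add --no-cache git openssh-client
# Trust the host key up front so the clone does not prompt.
RUN mkdir -p -m 0700 ~/.ssh && ssh-keyscan github.com >> ~/.ssh/known_hosts
# The agent socket is forwarded only for this instruction; no key material
# is copied into the image. The repository below is hypothetical.
RUN --mount=type=ssh git clone git@github.com:example-org/private-repo.git /src
```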
This concludes our demo on using Docker BuildKit to optimize your Dockerfiles and consequently speed up your image build time.
These speed gains result in much-needed savings in time and computational power, which should not be neglected.
Like Charles Duhigg wrote in The Power of Habit:
"Small victories are the consistent application of a small advantage."
You will definitely reap the benefits if you build good practices and habits.
Published at DZone with permission of Rui Trigo. See the original article here.