Speed Up Multi-Stage Docker Builds in CI/CD With Buildkit’s Registry Cache
Working on a GitOps framework around Kubernetes, I naturally run everything in containers. Unfortunately, the two Dockerfiles that matter most to me both have to download, at build time, a lot of dependencies that are not included in the repository. This makes the layer cache crucial. But using ephemeral CI/CD runners, GitHub Actions in my case, means each run starts with an empty cache.
The first of the two Dockerfiles builds the image for the framework itself. This image is used for bootstrapping, automation runs, and disaster recovery. As such, it’s not your run-of-the-mill Dockerfile. It includes a number of dependencies installed from Debian packages, various Go binaries, and last but not least, the Python-based CLIs of AWS, Azure, and Google Cloud.
It makes heavy use of multi-stage builds and has separate build stages for the common dependencies and for each cloud provider’s specific dependencies. The layers of the final image also mirror the build stage logic.
Dockerfile number two is for the Kubestack website itself. The site is built using Gatsby and has to download a lot of node modules during its build. The Dockerfile is optimized for cacheability and uses a multi-stage build, with a build environment based on Node.js and a final image based on Nginx to serve the static build.
Build times for both the framework image and the website image benefit heavily from a layer cache.
For some time, Docker has been able to use an image as the build cache via the --cache-from parameter. This was my preferred option because I need the ability to build and push images anyway. Storing the cache alongside the image is a no-brainer in my opinion.
For the website image, the first step of my CI/CD pipeline is to pull the cache image. Please note the || true at the end to ensure a missing cache doesn’t prevent my build from running.
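A minimal sketch of this pull step, written as a shell function so it can be sourced into a pipeline script. The image name registry.example.com/website is a placeholder, not the repository from the original pipeline; only the build-cache tag follows the naming used in the text.

```shell
# Placeholder cache image reference (assumption, not the real repository).
CACHE_IMAGE="registry.example.com/website:build-cache"

# Pull the previous cache image. The trailing `|| true` keeps the step
# from failing on the first run, when no cache image exists yet.
pull_cache() {
  docker pull "$CACHE_IMAGE" || true
}
```

Calling pull_cache as the first pipeline step then succeeds whether or not the cache image has been pushed before.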
Step two runs a build targeting the dev stage of my multi-stage Dockerfile and tags the result as the new build-cache.
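Sketched the same way, the cache build could look as follows. The dev stage and the build-cache tag come from the text; the image name and the use of the current directory as build context are assumptions.

```shell
# Placeholder image repository (assumption).
IMAGE="registry.example.com/website"

# Build only the `dev` stage, reusing layers from the previously pulled
# cache image, and tag the result as the new build-cache.
build_cache_stage() {
  docker build \
    --cache-from "$IMAGE:build-cache" \
    --target dev \
    -t "$IMAGE:build-cache" \
    .
}
```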
The next step runs the actual build that produces the final image and tags it as well.
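A sketch of the final build, again with a placeholder image name. The tag is passed in as an argument; using something like the commit SHA for it is an assumption, not a detail from the text.

```shell
# Placeholder image repository (assumption).
IMAGE="registry.example.com/website"

# Build the final stage, reusing the dev-stage layers from the cache
# image, and tag the result with the tag given as first argument.
build_final_image() {
  tag="$1"
  docker build \
    --cache-from "$IMAGE:build-cache" \
    -t "$IMAGE:$tag" \
    .
}
```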
Finally, the pipeline pushes both images.
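The push step, under the same placeholder naming, might be sketched like this. Pushing the cache image first is an arbitrary choice in this sketch.

```shell
# Placeholder image repository (assumption).
IMAGE="registry.example.com/website"

# Push the refreshed cache image and the final image. The `&&` stops
# after the first failed push so the step reports the error.
push_images() {
  tag="$1"
  docker push "$IMAGE:build-cache" &&
    docker push "$IMAGE:$tag"
}
```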
For a simple, multi-stage build with only two stages, like my Gatsby website’s Dockerfile, this works pretty well.
But when I tried this for a project with multiple build stages, one for Python and one for JS, specifying two images under --cache-from, it never seemed to work reliably. This is doubly unfortunate because a layer cache here would save the time spent downloading Python and JS dependencies on every run.
Having cache pull and cache build steps for every stage also makes for an increasingly verbose pipeline file the more stages you have.
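For illustration, an attempt along these lines would look roughly as follows. The --cache-from flag can be given more than once; the image name and the python-cache and js-cache tags are hypothetical and only stand in for the two per-stage cache images.

```shell
# Placeholder image repository (assumption).
IMAGE="registry.example.com/app"

# Pass one cache image per build stage. Each of these cache images
# would also need its own pull and build step earlier in the pipeline.
build_with_two_caches() {
  tag="$1"
  docker build \
    --cache-from "$IMAGE:python-cache" \
    --cache-from "$IMAGE:js-cache" \
    -t "$IMAGE:$tag" \
    .
}
```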
So, for the framework Dockerfile, I needed something better.
Enter buildkit. Buildkit brings a number of improvements to container image building. The ones that won me over are:
- Running build stages concurrently.
- Increasing cache efficiency.
- Handling secrets during builds.
Apart from generally increasing cache efficiency, buildkit also allows more control over caches when building with buildctl. This is what I needed. Buildkit has three options for exporting the cache, called inline, registry, and local. Local is not particularly interesting in my case, but would allow writing the cache to a directory. Inline embeds the cache in the final image, so the image and its cache are pushed to the registry together. But this only includes the cache for the final stage in multi-stage builds. Finally, the registry option pushes the cached layers of all stages into a separate image. That is exactly what my framework Dockerfile required.
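As a rough sketch, the three cache types map to values of buildctl's --export-cache option like this. The helper function, the cache reference, and the default directory are made up for illustration; mode=max is the buildkit setting that exports the layers of intermediate stages as well.

```shell
# Placeholder cache image reference (assumption).
CACHE_REF="registry.example.com/framework:buildcache"

# Print the value for `buildctl build --export-cache ...` for one of the
# three cache types. For `local`, the second argument is the directory.
export_cache_arg() {
  case "$1" in
    inline)   echo "type=inline" ;;
    registry) echo "type=registry,ref=$CACHE_REF,mode=max" ;;
    local)    echo "type=local,dest=${2:-./buildkit-cache}" ;;
    *)        echo "unknown cache type: $1" >&2; return 1 ;;
  esac
}
```

For example, export_cache_arg registry prints the value for a separate cache image that covers all stages.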
Let’s take a look at how I’m using this in my pipeline. Having the cache export and import included in buildkit means I can collapse the separate pull, build, and push steps into one. And it stays a single step, no matter how many stages my Dockerfile has.
This one command handles pulling and importing the cache, building the image, exporting the cache, and pushing the image and the cache. By running the build inside a container, I also don’t have to worry about installing the buildkit daemon and CLI. The only thing I needed to do was provide the .docker/config to the build inside the container so it can push the image and the cache to the registry.
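A sketch of what such a single step can look like, using the moby/buildkit image and its buildctl-daemonless.sh entrypoint. The image name, the buildcache tag, the mount paths, and the choice of the latest image tag are assumptions for illustration and may differ from the actual Kubestack pipeline.

```shell
# Placeholder image repository (assumption).
IMAGE="registry.example.com/framework"

# Run buildctl inside the buildkit container. The working directory is
# mounted as build context and ~/.docker provides registry credentials.
# One invocation imports the cache, builds, pushes the image, and
# exports the cache for all stages (mode=max) to a separate cache image.
build_and_push() {
  tag="$1"
  docker run --rm --privileged \
    -v "$PWD:/tmp/work" \
    -v "$HOME/.docker:/root/.docker" \
    --entrypoint buildctl-daemonless.sh \
    moby/buildkit:latest \
    build \
    --frontend dockerfile.v0 \
    --local context=/tmp/work \
    --local dockerfile=/tmp/work \
    --output "type=image,name=$IMAGE:$tag,push=true" \
    --import-cache "type=registry,ref=$IMAGE:buildcache" \
    --export-cache "type=registry,ref=$IMAGE:buildcache,mode=max"
}
```

Adding more stages to the Dockerfile does not change this step, since the cache for every stage travels in the one cache image.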
For a working example, take a look at the Kubestack release automation pipeline on GitHub.
Using the cache, the framework image builds in less than one minute — down from about three minutes before using buildkit without the cache export and import.
Published at DZone with permission of Philipp Strube. See the original article here.