Expert Techniques to Trim Your Docker Images and Speed Up Build Times
Use -slim base images, multi-stage builds, smart layer caching, and chained RUN commands to build lean, fast, and production-ready Docker images.
Key Takeaways
- Pick your base image like you're choosing a foundation for your house. Going with a minimal variant, like python-slim or a runtime-specific CUDA image, is hands down the quickest way to slash your image size and reduce security risks.
- Multi-stage builds are your new best friend for keeping things organized. Think of it like having a messy workshop (your "builder" stage) where you do all the heavy lifting with compilers and testing tools, then only moving the finished product to your clean showroom (the "runtime" stage).
- Layer your Dockerfile with caching in mind, always. Put the stuff that rarely changes (like dependency installation) before the stuff that changes all the time (like your app code). This simple trick can cut your build times from minutes to mere seconds.
- Remember that every RUN command creates a permanent layer. You've got to chain your installation and cleanup commands together with && to make sure temporary files actually disappear within the same layer. Otherwise, you're just hiding a mess under the rug while still paying for the storage.
- Stop treating .dockerignore like an afterthought. Make it your first line of defense to keep huge datasets, model checkpoints, and (yikes!) credentials from ever getting near your build context.
So you've built your AI model, containerized everything, and hit docker build. The build finishes, and there it is: a multi-gigabyte monster staring back at you. If you've worked with AI containers, you know this pain. Docker's convenience comes at a price, and that price is bloated, sluggish images that slow down everything from developer workflows to CI/CD pipelines while burning through your cloud budget.
This guide isn't just another collection of Docker tips. We're going deep into the fundamental principles that make containers efficient. We'll tackle both sides of the optimization coin:
- The Architecture: Making smart choices about base images and how you structure your builds.
- The Mechanics: Getting your hands dirty with layers, caching, and cleanup techniques.
To keep things real, we'll work through an actual example: a text classification app using BERT. We'll take this beast from a 2.37GB container that takes forever to build down to a slim 720MB image that rebuilds in 25 seconds. Let's dive in.
The Starting Point: Diagnosing a 2.37GB AI Image
Our starting project, naive_image, works fine, but it's definitely not winning any optimization awards. A quick build tells the whole story: we're looking at a 2.37GB image that takes 56 seconds to build on an Apple M1 Max. Ouch.
Dockerfile (from naive_image):
# naive_image/Dockerfile
# This is the initial, naive Dockerfile.
# It aims to be simple and functional, but NOT optimized for size or speed.
# Use a standard, general-purpose Python image.
FROM python:3.10
RUN apt-get update && apt-get install -y curl
# Set the working directory inside the container
# All subsequent commands will run from this directory
WORKDIR /app
# Copy requirements first for better layer caching
COPY naive_image/requirements.txt ./requirements.txt
# Install all dependencies listed in requirements.txt.
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code and data
COPY naive_image/app/ ./app/
COPY naive_image/sample_data/ ./sample_data/
RUN echo "Build complete" > /app/build_status.txt
# Command to run the application when the container starts.
# This runs the predictor script with the sample text file.
CMD ["python", "app/predictor.py", "sample_data/sample_text.txt"]
Running docker history immediately shows us what's going wrong. We've got two main offenders here: that chunky python:3.10 base image and a massive pip install layer that's adding about 1.5GB all by itself. It's installing everything and the kitchen sink. Now that we know what we're dealing with, let's fix it.
docker history bert-classifier-naive
You'll see something like this:
IMAGE CREATED CREATED BY SIZE COMMENT
b0693be54230 2 minutes ago CMD ["python" "app/predictor.py" "sample_dat… 0B buildkit.dockerfile.v0
<missing> 2 minutes ago RUN /bin/sh -c echo "Build complete" > /app/… 15B buildkit.dockerfile.v0
<missing> 2 minutes ago COPY naive_image/sample_data/ ./sample_data/… 376B buildkit.dockerfile.v0
<missing> 2 minutes ago COPY naive_image/app/ ./app/ # buildkit 12.2kB buildkit.dockerfile.v0
<missing> 2 minutes ago RUN /bin/sh -c pip install --no-cache-dir -r… 1.51GB buildkit.dockerfile.v0
<missing> 3 minutes ago COPY naive_image/requirements.txt ./requirem… 362B buildkit.dockerfile.v0
<missing> 3 minutes ago WORKDIR /app 0B buildkit.dockerfile.v0
<missing> 3 minutes ago RUN /bin/sh -c apt-get update && apt-get ins… 19.4MB buildkit.dockerfile.v0
<missing> 3 weeks ago CMD ["python3"] 0B buildkit.dockerfile.v0
<missing> 3 weeks ago RUN /bin/sh -c set -eux; for src in idle3 p… 36B buildkit.dockerfile.v0
<missing> 3 weeks ago RUN /bin/sh -c set -eux; wget -O python.ta… 58.2MB buildkit.dockerfile.v0
<missing> 3 weeks ago ENV PYTHON_SHA256=4c68050f049d1b4ac5aadd0df5… 0B buildkit.dockerfile.v0
<missing> 3 weeks ago ENV PYTHON_VERSION=3.10.17 0B buildkit.dockerfile.v0
<missing> 3 weeks ago ENV GPG_KEY=A035C8C19219BA821ECEA86B64E628F8… 0B buildkit.dockerfile.v0
<missing> 3 weeks ago RUN /bin/sh -c set -eux; apt-get update; a… 18.2MB buildkit.dockerfile.v0
<missing> 3 weeks ago ENV LANG=C.UTF-8 0B buildkit.dockerfile.v0
<missing> 3 weeks ago ENV PATH=/usr/local/bin:/usr/local/sbin:/usr… 0B buildkit.dockerfile.v0
<missing> 16 months ago RUN /bin/sh -c set -ex; apt-get update; ap… 560MB buildkit.dockerfile.v0
<missing> 16 months ago RUN /bin/sh -c set -eux; apt-get update; a… 183MB buildkit.dockerfile.v0
<missing> 2 years ago RUN /bin/sh -c set -eux; apt-get update; a… 48.5MB buildkit.dockerfile.v0
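If you'd rather rank the layers than eyeball them, here's a minimal Python sketch. It assumes you've copied the non-empty sizes from the history output above by hand; the layer labels and the helper names (parse_size, rank_layers) are made up for this illustration and are not part of Docker.

```python
import re

# docker history prints decimal units (kB, MB, GB), so multipliers are powers of 1000.
UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size string such as '1.51GB' or '362B' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)(B|kB|MB|GB)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2)]


# Non-empty layers copied by hand from the history output above.
# The labels are shorthand invented for this sketch, not Docker output.
NAIVE_LAYERS = [
    ("RUN echo build status", "15B"),
    ("COPY sample_data/", "376B"),
    ("COPY app/", "12.2kB"),
    ("RUN pip install", "1.51GB"),
    ("COPY requirements.txt", "362B"),
    ("RUN apt-get install curl", "19.4MB"),
    ("base layer (36B)", "36B"),
    ("base layer (58.2MB)", "58.2MB"),
    ("base layer (18.2MB)", "18.2MB"),
    ("base layer (560MB)", "560MB"),
    ("base layer (183MB)", "183MB"),
    ("base layer (48.5MB)", "48.5MB"),
]


def rank_layers(layers):
    """Return (label, bytes) pairs sorted from largest to smallest."""
    sized = [(label, parse_size(size)) for label, size in layers]
    return sorted(sized, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ranked = rank_layers(NAIVE_LAYERS)
    total = sum(size for _, size in ranked)
    for label, size in ranked[:3]:
        print(f"{size / 1e9:5.2f} GB  {size / total:4.0%}  {label}")
```

The pip install layer and the big base-image layers float straight to the top, which is exactly where the next two sections aim.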
Part 1: Blueprint for Efficiency: Base Images and Multi-Stage Builds
Before we start tweaking the small stuff, let's fix the big architectural issues.
A. The Quick Win: Slim Base Images
Think of your FROM instruction as choosing the foundation for a house. Pick a heavy foundation, and you're stuck with a heavy house. The standard python:3.10 image comes with the full Debian experience, complete with development tools and libraries you'll never need in production. Our first move? Switch to something leaner.
Check out what happens when we make this one simple change in our slim_image project's Dockerfile:
# slim_image/Dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY ./requirements.txt ./
COPY ./app/ ./app/
COPY ./sample_data/ ./sample_data/
RUN pip install --no-cache-dir -r requirements.txt
CMD ["python", "app/predictor.py", "sample_data/sample_text.txt"]
Just this one-line change and boom: we go from 2.37GB to 1.5GB, and build time drops from 56s to 51s. Running docker history shows exactly why this works. The python:3.10-slim base is hundreds of megabytes smaller. The same principle applies if you're doing GPU work: always go for the lean nvidia/cuda:<version>-runtime image instead of the bloated nvidia/cuda:<version>-devel for production.
docker history bert-classifier-slim
Here's what you'll see:
IMAGE CREATED CREATED BY SIZE COMMENT
4633330c13b5 9 seconds ago CMD ["python" "app/predictor.py" "sample_dat… 0B buildkit.dockerfile.v0
<missing> 9 seconds ago RUN /bin/sh -c pip install --no-cache-dir -r… 1.34GB buildkit.dockerfile.v0
<missing> 58 seconds ago COPY slim_image/sample_data/ ./sample_data/ … 376B buildkit.dockerfile.v0
<missing> 58 seconds ago COPY slim_image/app/ ./app/ # buildkit 5.51kB buildkit.dockerfile.v0
<missing> 58 seconds ago COPY slim_image/requirements.txt ./requireme… 334B buildkit.dockerfile.v0
<missing> 58 seconds ago WORKDIR /app 0B buildkit.dockerfile.v0
<missing> 5 weeks ago CMD ["python3"] 0B buildkit.dockerfile.v0
<missing> 5 weeks ago RUN /bin/sh -c set -eux; for src in idle3 p… 36B buildkit.dockerfile.v0
<missing> 5 weeks ago RUN /bin/sh -c set -eux; savedAptMark="$(a… 46.4MB buildkit.dockerfile.v0
<missing> 5 weeks ago ENV PYTHON_SHA256=ae665bc678abd9ab6a6e1573d2… 0B buildkit.dockerfile.v0
<missing> 5 weeks ago ENV PYTHON_VERSION=3.10.18 0B buildkit.dockerfile.v0
<missing> 5 weeks ago ENV GPG_KEY=A035C8C19219BA821ECEA86B64E628F8… 0B buildkit.dockerfile.v0
<missing> 5 weeks ago RUN /bin/sh -c set -eux; apt-get update; a… 9.18MB buildkit.dockerfile.v0
<missing> 5 weeks ago ENV LANG=C.UTF-8 0B buildkit.dockerfile.v0
<missing> 5 weeks ago ENV PATH=/usr/local/bin:/usr/local/sbin:/usr… 0B buildkit.dockerfile.v0
<missing> 5 weeks ago # debian.sh --arch 'arm64' out/ 'bookworm' '… 97.2MB debuerreotype 0.15
B. The Architectural Leap: Isolating Environments With Multi-Stage Builds
Okay, our image is smaller, but let's be honest: is it really production-ready? Take a peek at our dependencies. We've got the essentials like transformers and torch, but also a bunch of dev tools like pytest, black, and jupyter. These are great for development, but in production? They're just dead weight and potential security holes. Shipping dev tools to production is like bringing your entire toolbox when you only need a screwdriver.
This is where multi-stage builds come to the rescue.
Here's the mental model: imagine building a piece of furniture. You do all the messy work in your garage workshop, which is full of saws, drills, and sawdust. Once you're done, you move only the finished furniture to your living room. The garage stays messy, but who cares? Your living room is pristine.
In Docker terms, your "builder" stage is that messy garage. You can install compilers, testing frameworks, whatever you need. Then, in your "runtime" stage, you start fresh and cherry-pick only the finished pieces you actually need. When the build completes, only the final stage ends up in the image you ship; the garage stays behind (at most as build cache on the build machine).
Here's how it actually works:
- FROM python:3.10 AS builder: This creates your workshop and gives it a name.
- Inside this stage, go wild. Install everything, run tests with RUN pytest, whatever you need.
- FROM python:3.10-slim AS runtime: Start fresh with a clean stage.
- COPY --from=builder <source> <destination>: This is where the magic happens. You can selectively grab stuff from your builder stage.
Let's see this in action with our multistage_image project's Dockerfile:
# multistage_image/Dockerfile
# ====== BUILD STAGE ======
FROM python:3.10 AS builder
WORKDIR /app
COPY multistage_image/requirements.txt multistage_image/runtime_requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt # Installs ALL deps, including pytest
# You could run tests here: RUN pytest
# ====== RUNTIME STAGE ======
FROM python:3.10-slim AS runtime
WORKDIR /app
COPY multistage_image/runtime_requirements.txt ./
RUN pip install --no-cache-dir -r runtime_requirements.txt # Installs ONLY runtime deps
COPY multistage_image/app/ ./app/
COPY multistage_image/sample_data/ ./sample_data/
CMD ["python", "app/predictor.py", "sample_data/sample_text.txt"]
The results speak for themselves: we're down to 827MB and builds take just 24 seconds. We've eliminated hundreds of megabytes of dev-only Python packages. This pattern really shines when you're dealing with compiled dependencies too, since compilers like gcc or nvcc can stay in the builder stage where they belong.
Now, you might be wondering: "Wait, why aren't we using COPY --from=builder for the Python packages? Why run pip install twice?" Great question!
This Dockerfile uses what I call the "Re-install from Lockfile" pattern:
- Why do this? We want to guarantee that our final image contains only the packages in runtime_requirements.txt. No chance of a random dev dependency sneaking in.
- The good: It's crystal clear what's happening, easy to audit, and your production dependencies are pristine. The builder stage acts like a CI check, making sure everything plays nicely together.
- The not-so-good: You need network access in the final stage to hit PyPI again.
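This pattern only holds up if the two requirements files stay in sync. Here's a small, hedged Python sketch of a check you could run in CI: it reports which packages are dev-only and flags anything in the runtime list that the full list doesn't know about. The file contents below are stand-ins built from the package names mentioned earlier (the real files aren't shown here), and package_names and compare are hypothetical helpers, not pip APIs. The parsing is deliberately simple and won't cover every requirements-file feature.

```python
import re


def package_names(requirements_text):
    """Return the (roughly normalized) package names found in requirements-file text."""
    names = set()
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):      # skip blanks and pip options such as -r
            continue
        # Cut the line at the first version specifier, extras bracket, marker, or space.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0]
        names.add(name.lower().replace("_", "-"))
    return names


def compare(all_requirements, runtime_requirements):
    """Split packages into dev-only ones and runtime ones missing from the full list."""
    everything = package_names(all_requirements)
    runtime = package_names(runtime_requirements)
    return {
        "dev_only": everything - runtime,
        "missing_from_full": runtime - everything,
    }


# Illustrative stand-ins for requirements.txt and runtime_requirements.txt.
ALL_REQS = "transformers\ntorch\npytest\nblack\njupyter\n"
RUNTIME_REQS = "transformers\ntorch\n"

report = compare(ALL_REQS, RUNTIME_REQS)
print("dev-only packages:", sorted(report["dev_only"]))
print("runtime packages missing from the full list:", sorted(report["missing_from_full"]))
```

If the second list is ever non-empty, the builder stage is no longer testing what the runtime stage installs.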
The Alternative: The "Build and Copy Artifacts" Pattern
But what if you've got a custom C extension that needs compiling? The "Re-install from Lockfile" pattern won't work: the slim runtime stage has no compiler to build it, and nothing you compiled in the builder carries over to a fresh pip install.
That's when you'd use a more direct approach:
# Alternative "Build and Copy Artifacts" Pattern
# ====== BUILD STAGE ======
FROM python:3.10 AS builder
WORKDIR /build-env
# Install build tools like gcc, cmake, etc.
RUN apt-get update && apt-get install -y build-essential
# Prepare a clean directory of only runtime packages
WORKDIR /runtime_packages
COPY runtime_requirements.txt .
# --target installs packages to a specified directory instead of the global site-packages
RUN pip install --no-cache-dir --target=. -r runtime_requirements.txt
# If you had a C extension (its source must sit inside the build context; COPY cannot reach ../):
# COPY my_c_extension ./src/my_c_extension
# RUN pip install --no-cache-dir --target=. ./src/my_c_extension
# ====== RUNTIME STAGE ======
FROM python:3.10-slim AS runtime
WORKDIR /app
# This is the "magic link" in action for packages.
# It copies the fully prepared dependencies directly from the builder.
COPY --from=builder /runtime_packages /usr/local/lib/python3.10/site-packages/
COPY ./app ./app
# ...
- Why do this? You get a completely self-contained build with zero network dependencies in the final stage, plus it handles compiled code.
- The good: It's hermetic and works for any compiled dependencies. The runtime stage builds super fast since it's just copying files.
- The tricky part: You need to be careful in the builder stage to keep that /runtime_packages directory clean.
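Why does copying a directory into site-packages work at all? Because an installed package is just files in a directory that Python searches on import. The sketch below is a toy demonstration of that mechanism, not part of the build: the temporary directory stands in for /runtime_packages, and demo_pkg_for_copy and import_from_directory are invented names.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_directory(directory, module_name):
    """Import module_name with directory temporarily placed on the module search path."""
    sys.path.insert(0, str(directory))
    try:
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(str(directory))


with tempfile.TemporaryDirectory() as packages_dir:
    # Stand-in for what `pip install --target` leaves behind: plain files in a directory.
    (Path(packages_dir) / "demo_pkg_for_copy.py").write_text("VALUE = 42\n")
    module = import_from_directory(packages_dir, "demo_pkg_for_copy")
    print(module.VALUE)
```

One caveat worth keeping in mind: compiled extensions are tied to the Python version and platform they were built for, so the builder and runtime stages should share both, as python:3.10 and python:3.10-slim do here.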
Picking the Right Approach
For our multistage_image project, where we mainly wanted to keep pytest away from transformers, the "Re-install from Lockfile" pattern works perfectly. If we had compiled dependencies, we'd have no choice but to use the "Build and Copy Artifacts" pattern.
Knowing both patterns gives you options. And speaking of results, our multi-stage build brings us down to 827MB. Not bad!
Part 2: The Mechanical Tuning — Cache, Layers, and Context
Now that we've got the architecture sorted, let's get into the nitty-gritty of making builds blazing fast. Our final layered_image project shows how it's done.
A. Mastering the Build Cache (The Key to Fast Iteration)
Experiment 1: Experience the Cache Magic
First, let's build our properly structured layered_image. The first build takes about 23 seconds.
time docker build -t bert-classifier:layered -f layered_image/Dockerfile layered_image/
Now here's where it gets fun. Open layered_image/app/predictor.py and make a tiny change, like adding a comment. Run the exact same build command again. Watch this: it finishes in less than a second. Docker marks the slow pip install step as CACHED (the legacy builder prints "Using cache") because its input (runtime_requirements.txt) didn't change. Only the final COPY runs again. That's the power of proper caching!
Experiment 2: How to Destroy Your Cache (And Your Productivity)
Let's break things on purpose to see why layer order matters. Edit your layered_image/Dockerfile and move the line COPY layered_image/app/ ./app/ to before the RUN pip install ... line. Make another small change to app/predictor.py and rebuild.
What happens? The build takes the full 23 seconds again! Your innocent code change busted the cache at the (now earlier) COPY step. Since pip install comes after this busted cache, it has to run from scratch too. This is why getting your layer order right isn't just a nice-to-have; it's essential for your sanity.
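Both experiments can be replayed with a toy model in Python. This is a deliberately simplified sketch, not BuildKit's actual algorithm: it assumes each step's cache key is a hash of its parent's key, the instruction text, and the contents of any files it copies. The step lists and file contents are illustrative.

```python
import hashlib


def digest(*parts):
    """Short, stable hash of the given strings."""
    return hashlib.sha256("\n".join(parts).encode()).hexdigest()[:12]


def layer_keys(steps, files):
    """Compute one cache key per step; each key depends on every step before it."""
    keys = []
    parent = "base-image"
    for instruction, copied_paths in steps:
        contents = [files[path] for path in copied_paths]
        parent = digest(parent, instruction, *contents)
        keys.append(parent)
    return keys


def rebuilt_steps(steps, files_before, files_after):
    """List the instructions whose cache key changed between two builds."""
    before = layer_keys(steps, files_before)
    after = layer_keys(steps, files_after)
    return [step[0] for step, old, new in zip(steps, before, after) if old != new]


GOOD_ORDER = [
    ("COPY runtime_requirements.txt ./", ["runtime_requirements.txt"]),
    ("RUN pip install -r runtime_requirements.txt", []),
    ("COPY app/ ./app/", ["app/predictor.py"]),
]
# Experiment 2: the app COPY is moved ahead of pip install.
BAD_ORDER = [GOOD_ORDER[0], GOOD_ORDER[2], GOOD_ORDER[1]]

first_build = {
    "runtime_requirements.txt": "transformers\ntorch\n",
    "app/predictor.py": "print('v1')\n",
}
second_build = dict(first_build)
second_build["app/predictor.py"] = "print('v1')  # tiny edit\n"

print("good order rebuilds:", rebuilt_steps(GOOD_ORDER, first_build, second_build))
print("bad order rebuilds: ", rebuilt_steps(BAD_ORDER, first_build, second_build))
```

In the good order only the app COPY gets a new key. In the bad order the pip install step inherits a changed parent key, so it reruns even though its own instruction is identical.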
B. The Art of the RUN Command (The Key to Microscopic Layers)
Here's something that trips up a lot of people: every RUN command creates a new, permanent layer. Once a file exists in a layer, you can't truly delete it from your image size, even if you remove it in a later layer. It's like trying to erase something written in permanent marker by writing over it. The original is still there underneath, taking up space.
Our testing shows that a basic RUN pip install ... creates a layer weighing in at 679MB. But watch what happens when we chain everything into one command:
# LAYER OPTIMIZATION: Install runtime dependencies and clean up in a single layer
# Note: the trailing "|| true" applies to the whole && chain, so it would also mask a failed pip install.
RUN pip install --no-cache-dir -r runtime_requirements.txt && \
    pip cache purge && \
    rm -rf /tmp/* /var/tmp/* && \
    find /usr/local/lib/python*/site-packages/ -name "*.pyc" -delete && \
    find /usr/local/lib/python*/site-packages/ -name "__pycache__" -type d -exec rm -rf {} + || true
This single command creates a layer of just 572MB. That's 107MB saved just by doing our cleanup in the same breath as the install. If you check docker history, you'll see one lean 572MB layer instead of a bloated 679MB layer followed by useless cleanup attempts.
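The permanent-marker effect is easy to model. In this Python sketch an image is a list of layers, each recording what it adds and what it removes; the image size is simply the sum of what every layer added. The 572MB and 107MB figures come from the measurements above, while the two file buckets and the function names are simplifications made up for the example.

```python
def image_size(layers):
    """Total size in MB: every layer's additions count, later removals free nothing."""
    return sum(sum(layer["added"].values()) for layer in layers)


def visible_files(layers):
    """Files a running container would see once all layers are stacked."""
    view = {}
    for layer in layers:
        view.update(layer["added"])
        for path in layer.get("removed", ()):
            view.pop(path, None)
    return view


# Two RUN instructions: install first, clean up in a later layer.
separate_runs = [
    {"added": {"site-packages": 572, "temp-and-bytecode": 107}},
    {"added": {}, "removed": ["temp-and-bytecode"]},
]

# One chained RUN: the temporary files are gone before the layer is committed.
chained_run = [
    {"added": {"site-packages": 572}},
]

print("separate RUNs:", image_size(separate_runs), "MB")
print("chained RUN:  ", image_size(chained_run), "MB")
print("same visible files:", visible_files(separate_runs) == visible_files(chained_run))
```

Both variants give the container the same files, but only the chained one avoids paying for the 107MB that was deleted.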
Pro tip for BuildKit users: If you're running a recent Docker version, throw # syntax=docker/dockerfile:1.7 at the top of your Dockerfile to unlock some sweet BuildKit features:
# syntax=docker/dockerfile:1.7
# ...
RUN --mount=type=cache,target=/root/.cache/pip pip install -r requirements.txt
This tells BuildKit to maintain a persistent pip cache that lives outside your image layers. You get faster builds without the bloat. Win-win!
C. The Gatekeeper: Mastering .dockerignore
When you run docker build ., that innocent-looking . tells Docker to package up everything in the current directory and ship it to the build daemon. For AI projects, "everything" might include multi-gigabyte datasets, model checkpoints, Jupyter notebooks, and (heaven forbid) your virtual environment. Sending all this junk isn't just slow; it's a security nightmare waiting to happen.
Enter .dockerignore, your bouncer at the door. It stops files from even getting into the build context. For AI projects, a solid .dockerignore isn't optional.
Here's a starting-point .dockerignore for AI projects (abbreviated):
# .dockerignore
# Python virtual environments
.venv/
# Python caches (patterns are relative to the context root; the **/ prefix matches at any depth)
**/__pycache__/
# IDE and OS specific
.vscode/
.idea/
# ... etc.
Quick note about .git: Usually, you want to exclude it. But if you're doing something fancy like embedding commit info in your container, you might need it. For 99% of cases, though? Ignore it.
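Not sure what deserves a spot in your .dockerignore? One low-tech way to find out is to measure what each top-level entry would contribute to the build context. The Python sketch below does just that. It does not read or apply .dockerignore rules; it only shows you where the bytes are. The function names are invented for this example.

```python
import os
from pathlib import Path


def tree_size(path):
    """Total bytes of regular files under path (symlinks are not followed)."""
    if path.is_symlink():
        return 0
    if path.is_file():
        return path.stat().st_size
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            file_path = Path(dirpath) / name
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def context_breakdown(root):
    """Map each top-level entry under root to the number of bytes it contains."""
    return {entry.name: tree_size(entry) for entry in Path(root).iterdir()}


if __name__ == "__main__":
    sizes = context_breakdown(".")
    for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:10]:
        print(f"{size / 1e6:10.1f} MB  {name}")
```

Run it from the directory you pass to docker build; anything large that the image doesn't need is a candidate for the ignore file.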
The Final Result: A Production-Ready Blueprint
By combining smart architecture with mechanical precision, we end up with our final layered_image Dockerfile. Here's how far we've come:
| Image | Size | Time to build |
|---|---|---|
| naive_image | 2.37 GB | 56 seconds |
| slim_image | 1.5 GB | 51 seconds |
| multistage_image | 827 MB | 24 seconds |
| layered_image | 720 MB | 25 seconds |
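To put those numbers in relative terms, here's a quick bit of arithmetic in Python using the sizes and build times reported above (sizes converted to MB). The reduction helper is just a convenience defined for this example.

```python
# (image, size in MB, build time in seconds) as reported in this article.
RESULTS = [
    ("naive_image", 2370, 56),
    ("slim_image", 1500, 51),
    ("multistage_image", 827, 24),
    ("layered_image", 720, 25),
]


def reduction(baseline, value):
    """Fractional saving of value relative to baseline (0.25 means 25% less)."""
    return 1 - value / baseline


_, base_size, base_time = RESULTS[0]
for name, size_mb, seconds in RESULTS:
    size_cut = reduction(base_size, size_mb)
    time_cut = reduction(base_time, seconds)
    print(f"{name:18s} size -{size_cut:.0%}   build time -{time_cut:.0%}")
```

Measured against the naive image, the final layered image is roughly 70% smaller and builds in a bit under half the time.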
Conclusion
Your Dockerfile isn't just some build script; it's the blueprint for your app's entire runtime environment. Once you understand how it really works, you can craft containers like a pro. We've seen how smart architecture (like multi-stage builds) combined with attention to detail (like cache optimization and layer management) can transform a bloated mess into a lean, professional container. These aren't just tricks; they're fundamental engineering principles for building efficient AI systems.
Now it's your turn. Go look at your own Dockerfiles. Can you reorder those layers for better caching? Could you chain those RUN commands to actually delete temporary files? Is your .dockerignore actually doing its job?