Top 5 Practices for Building Dockerized MCP Servers
Learn best practices for building MCP servers to power your LLM applications, and make sure your setup is isolated and secure.
The Model Context Protocol (MCP) is changing how we build software. It provides the "API" for large language models (LLMs) to interact with the real world. This lets an AI agent query a database, read a file, or call a third-party service. This new capability brings new challenges. MCP servers, the back-end tools the AI uses, are not traditional microservices. Their user is a non-deterministic AI, and they often need access to sensitive systems.
How do we build, deploy, and secure these servers reliably? The clear answer is Docker. The entire MCP ecosystem, including Docker's own MCP Toolkit and Catalog, is built around containerization. Running your MCP servers in Docker is not just a good idea; it is a necessary best practice. This article covers five key principles for building production-ready, Dockerized MCP servers.
1. Containerize Everything for Isolation and Portability
Your first and most important rule is that an MCP server should always be a Docker image. Never run it as a raw script on your host machine. The reasons for this come down to two key factors: isolation and portability.
Isolation is the security argument. An AI agent can be unpredictable, and a malicious prompt could trick it into calling your tool with dangerous inputs. If your pdf-reader tool has a vulnerability, you don't want it running with the same user permissions as your aws-s3-uploader. A Docker container gives each server its own process space, filesystem, and network, which limits what a compromised tool can reach.
Portability addresses the "it works on my machine" problem. Your Python-based Postgres server and your Node.js-based GitHub server have different, conflicting dependencies. Docker combines the server and all its dependencies into a single, portable package that works the same everywhere.
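As a rough illustration of the isolation point, here is a minimal sketch that launches a server image with tightened settings through the Docker SDK for Python. The specific limits (memory cap, UID, whether to allow networking) are assumptions chosen for the example, not recommendations from any MCP tooling.

```python
from typing import Any


def isolated_run_options(allow_network: bool = False) -> dict[str, Any]:
    """Build keyword arguments for containers.run() that tighten isolation."""
    options: dict[str, Any] = {
        "read_only": True,                      # mount the root filesystem read-only
        "cap_drop": ["ALL"],                    # drop every Linux capability
        "security_opt": ["no-new-privileges"],  # block privilege escalation
        "mem_limit": "256m",                    # assumed cap; tune per server
        "user": "65532:65532",                  # assumed non-root UID:GID
        "detach": True,                         # return a handle instead of blocking
    }
    if not allow_network:
        # Only a loopback interface: right for tools that never call out.
        options["network_mode"] = "none"
    return options


def start_server(image: str, allow_network: bool = False):
    """Start one MCP server image with the options above."""
    # Imported lazily so this module loads without the Docker SDK installed.
    import docker

    client = docker.from_env()
    return client.containers.run(image, **isolated_run_options(allow_network))
```

A tool like the pdf-reader, which only parses local input, could run with the default `allow_network=False`, while a server that calls an external API would pass `allow_network=True` and rely on other network controls.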
2. Adhere to the Principle of Single Responsibility
An MCP server is, in essence, a microservice, and it should follow a core principle of microservice design: the Single Responsibility Principle (SRP). For MCP, this means having one server for each logical domain.
Avoid the urge to create a large "god server" that connects to GitHub, Postgres, and Stripe all at the same time. Instead, build smaller, focused servers like mcp-github-server, mcp-database-server, and mcp-billing-server.
This approach offers two major benefits. First, it improves security. You can apply a "least privilege" model. The GitHub server receives a read-only token and does not have network access to your database. If it's compromised, the impact is minimal. Second, it provides clarity. It's much easier for an LLM to work with a small set of specific tools from a "Database" server than to choose the right tool from a long list of 50 different functions.
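One way to keep the least-privilege model honest is to write down what each server is allowed to hold and check that nothing is shared. The sketch below is illustrative only: the `ServerSpec` type, the secret names, and the host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """What one MCP server is allowed to hold and reach (hypothetical model)."""
    name: str
    secrets: frozenset[str]
    allowed_hosts: frozenset[str]


def shared_secrets(specs: list[ServerSpec]) -> dict[str, list[str]]:
    """Map each secret held by more than one server to the servers holding it."""
    owners: dict[str, list[str]] = {}
    for spec in specs:
        for secret in sorted(spec.secrets):
            owners.setdefault(secret, []).append(spec.name)
    return {secret: names for secret, names in owners.items() if len(names) > 1}


# One focused server per domain, each with its own credential and egress target.
SPECS = [
    ServerSpec("mcp-github-server", frozenset({"GITHUB_READONLY_TOKEN"}),
               frozenset({"api.github.com"})),
    ServerSpec("mcp-database-server", frozenset({"DATABASE_URL"}),
               frozenset({"db.internal"})),
    ServerSpec("mcp-billing-server", frozenset({"STRIPE_API_KEY"}),
               frozenset({"api.stripe.com"})),
]
```

With the three focused servers, `shared_secrets(SPECS)` returns an empty mapping. Adding a "god server" that also holds `DATABASE_URL` and `STRIPE_API_KEY` makes both secrets show up as shared, which is the blast-radius problem the split avoids.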
3. Write "Agent-First" Tool Descriptions
The "user" reading your tool's description is the AI agent, not a human developer. Your goal should be clarity, not clever concision. A large language model lacks human "common sense." It cannot infer your intent. It depends fully on the string descriptions and JSON schemas in your tool definition to grasp what a tool does, what its parameters mean, and when to use it.
- Unclear ❌: name: "get_issue", description: "Gets a JH issue."
- Why it is bad ❌: What's "JH"? Does it return one issue or many? Which identifier does it use to look the issue up? The LLM is left to guess.
- Clear ✅: name: "get_github_issue_by_number", description: "Fetches the full details of a single GitHub issue from a given repository, including the title, body, author, and labels. Requires the full repository name and the numerical issue number."
The "good" example is wordy, but it's ideal for an LLM. It clearly explains what it does ("fetches full details"), what it returns ("title, body, author, labels"), and what it needs ("full repository name," "numerical issue number"). This greatly improves the "hit rate," or the chance that the agent will successfully use your tool.
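To make this concrete, here is a sketch of how such a tool definition might look as a Python dictionary, using the `name`, `description`, and `inputSchema` fields of an MCP tool. The parameter names `repository` and `issue_number`, the owner/name format, and the `lint_tool` helper with its 40-character threshold are assumptions added for illustration.

```python
GET_GITHUB_ISSUE_TOOL = {
    "name": "get_github_issue_by_number",
    "description": (
        "Fetches the full details of a single GitHub issue from a given "
        "repository, including the title, body, author, and labels. "
        "Requires the full repository name and the numerical issue number."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "repository": {  # parameter name is an assumption
                "type": "string",
                "description": "Full repository name in owner/name form, "
                               "for example 'octocat/hello-world'.",
            },
            "issue_number": {  # parameter name is an assumption
                "type": "integer",
                "minimum": 1,
                "description": "Numerical issue number as shown in the issue URL.",
            },
        },
        "required": ["repository", "issue_number"],
    },
}


def lint_tool(tool: dict, min_description_length: int = 40) -> list[str]:
    """Flag tool definitions an agent would likely have to guess about."""
    problems = []
    if len(tool.get("description", "")) < min_description_length:
        problems.append("description is too short to be unambiguous")
    properties = tool.get("inputSchema", {}).get("properties", {})
    for param, schema in properties.items():
        if not schema.get("description"):
            problems.append(f"parameter '{param}' has no description")
    return problems
```

Running `lint_tool` on the unclear `get_issue` definition flags its short description, while the definition above passes. A check like this could run in CI so that vague descriptions never reach the agent.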
4. Harden Your Docker Image
You are building a service that an AI can call whenever needed. This creates a new security risk. Your Docker image is your first line of defense, so it needs to be secure. If an attacker discovers a weakness in your server's code, a hardened image should leave them with no tools to escalate the attack.
To achieve this, use minimal base images. Distroless images ship without a shell or a package manager. Alpine is also small, although it still includes a BusyBox shell and the apk package manager. Always run the service as a non-root user, using the USER instruction in your Dockerfile, so that an attacker who gains access does not have root permissions inside the container.
Next, use multi-stage builds to copy only the final application artifacts into your production image. This leaves behind build tools, compilers, and any source code the runtime does not need. Finally, incorporate a vulnerability scanner, such as Docker Scout, into your CI/CD pipeline. It can scan your images for known vulnerabilities and can be configured to fail the build when it finds a serious issue.
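The sketch below shows one possible multi-stage Dockerfile for a Python-based server, embedded as a string, together with a small audit function that checks for two of the practices above. The file names `requirements.txt` and `server.py`, the user name `mcp`, and the `audit_dockerfile` helper are assumptions for illustration. The audit is deliberately naive and does not handle line continuations.

```python
SAMPLE_DOCKERFILE = """\
# Build stage: installs dependencies and is never shipped.
FROM python:3.12-alpine AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
COPY server.py .

# Final stage: only the built artifacts, run as a non-root user.
FROM python:3.12-alpine
RUN adduser -D -u 10001 mcp
WORKDIR /app
COPY --from=build /app /app
ENV PYTHONPATH=/app/deps
USER mcp
ENTRYPOINT ["python", "server.py"]
"""


def audit_dockerfile(text: str) -> list[str]:
    """Return hardening problems found in a Dockerfile's text."""
    instructions = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            keyword, _, rest = stripped.partition(" ")
            instructions.append((keyword.upper(), rest.strip()))

    problems = []
    stages = [i for i, (kw, _) in enumerate(instructions) if kw == "FROM"]
    if len(stages) < 2:
        problems.append("no multi-stage build: build tools may ship in the final image")

    # Only USER instructions in the final stage affect the running container.
    final_stage = instructions[stages[-1]:] if stages else instructions
    users = [arg for kw, arg in final_stage if kw == "USER"]
    if not users:
        problems.append("final stage has no USER instruction, so it inherits "
                        "the base image's user (often root)")
    elif users[-1].split(":")[0] in {"root", "0"}:
        problems.append("final stage switches back to root")
    return problems
```

The sample passes the audit, while a single-stage Dockerfile with no USER instruction produces two findings. A check like this complements a vulnerability scanner rather than replacing it.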
5. Test Like a Real Service, Not a Mock
Unit tests with mocks can be fragile and often misrepresent real-world behavior. For MCP servers, integration testing is necessary. The best way to do this in a containerized environment is by using Testcontainers.
Testcontainers is a library that allows your test code to start and stop any Docker container programmatically. Instead of mocking a database, your pytest or JUnit test can launch a real, temporary Postgres container on a random port.
Your test then starts your mcp-database-server (also as a container) and configures it to connect to this temporary database. The test itself acts as an MCP client and calls your query_users tool, so you can check that you receive real data from a real database. When the test is finished, Testcontainers automatically removes both containers. This end-to-end test can catch real-world errors in connection strings, SQL syntax, and permissions that mocks would miss.
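The flow described above can be sketched with the testcontainers package for Python. This is an outline under stated assumptions: the `DB_*` environment variable names are hypothetical, and `call_query_users` stands in for your own code that starts mcp-database-server and calls query_users as an MCP client.

```python
from urllib.parse import urlsplit


def server_env_from_url(connection_url: str) -> dict[str, str]:
    """Split a database URL into env vars for the server under test.

    The DB_* names are hypothetical; use whatever your server reads.
    """
    parts = urlsplit(connection_url)
    return {
        "DB_HOST": parts.hostname or "",
        "DB_PORT": str(parts.port or 5432),
        "DB_USER": parts.username or "",
        "DB_PASSWORD": parts.password or "",
        "DB_NAME": parts.path.lstrip("/"),
    }


def check_server_against_real_postgres(call_query_users) -> None:
    """Start a throwaway Postgres and run one tool call against it.

    call_query_users is a caller-supplied function (an assumption of this
    sketch) that receives the env mapping, starts the MCP server with it,
    calls the query_users tool, and returns the resulting rows.
    """
    # Imported lazily so this module loads on machines without Docker.
    from testcontainers.postgres import PostgresContainer

    # The container starts on entry and is removed on exit.
    with PostgresContainer("postgres:16-alpine") as postgres:
        env = server_env_from_url(postgres.get_connection_url())
        # Note: a server running in its own container cannot reach the
        # database via "localhost"; it needs a shared Docker network.
        rows = call_query_users(env)
        # A real test would seed known rows first and compare them here.
        assert isinstance(rows, list)
```

Called from a pytest test function, `check_server_against_real_postgres` gives each run a fresh database on a random host port, so tests do not interfere with each other or with a developer's local Postgres.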
Conclusion
Building for an AI-first world requires a new discipline. Treat your MCP servers as top-tier, containerized microservices. Follow these five practices: containerize everything, embrace single responsibility, write agent-first descriptions, harden your images, and validate with Testcontainers. This approach will create a solid, production-ready foundation for the next generation of AI-powered applications.
Opinions expressed by DZone contributors are their own.