Cagent: Docker's Newest Low-Code Agentic Platform

Docker’s cagent is a new open-source, low-code, YAML-centric AI agent builder and runtime. Instead of writing code, you describe agents in YAML and cagent runs them.

Siri Varma Vegiraju

Feb. 25, 26 · Analysis

Cagent is Docker's new open-source framework for defining and running AI agents with minimal setup. With Cagent, you can start with a simple “Hello World” agent and scale all the way to complex multi-agent workflows. It provides core agent capabilities such as autonomy, reasoning, and action execution. It also supports the Model Context Protocol (MCP), works with multiple LLM providers as well as local models through Docker Model Runner (DMR), and simplifies agent distribution through the Docker registry.

Unlike traditional agentic frameworks that treat AI agents as programmatic objects requiring extensive Python or C# code, Cagent adopts a declarative, configuration-first philosophy. Instead of managing complex dependencies and writing custom orchestration logic, developers define an agent’s persona and capabilities in a single, portable YAML file, which decouples the agent's logic from the underlying infrastructure.

However, this simplicity comes with a trade-off: Cagent excels at rapid deployment and standardized tasks, but it gives up the granular programmatic control found in traditional agentic frameworks. In other words, Cagent is designed for portability and execution speed, whereas LangGraph and AutoGen are built for architectural flexibility and complex reasoning loops.

Prerequisites for Docker Cagent

To start using Cagent, you’ll need Docker Desktop 4.49 or later. If you’re using Docker Engine without Docker Desktop, you can install Cagent directly using your operating system’s package manager.

On macOS, install Cagent using Homebrew:

    Shell
   
   brew install cagent

On Windows, you can install it using winget:

    PowerShell
   
   winget install Docker.cagent

To verify the cagent installation, run the following command.

    PowerShell
   
   cagent version
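
If the command is not found, the install step most likely did not put the binary on your PATH. The sketch below shows one way to check for that before asking for the version. The helper name check_tool is hypothetical and not part of Cagent; it relies only on the standard command -v builtin.

```shell
# Report whether a command is available on PATH.
# Prints a status line and returns 0 if found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found on PATH" >&2
    return 1
  fi
}

# Only ask for the version when the binary actually exists.
if check_tool cagent; then
  cagent version
fi
```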

Building Your First Agent Using Docker Cagent

As part of this example, we will define a specialized Technical Writer agent using the openai/gpt-4o model.

Write the following configuration to an assistant.yaml file.

    YAML
   
version: "1"
agents:
  root:
    model: openai/gpt-4o
    description: "A professional technical writer who simplifies complex DevOps topics."
    instruction: |
      You are an expert technical writer.
      Your goal is to explain technical concepts clearly and concisely.
      Always use Markdown for formatting and include code snippets where relevant.

Set your API key:

    Shell
   
   export OPENAI_API_KEY=your_key_here

Execute the cagent run command.

    PowerShell
   
   cagent run assistant.yaml

Once it is running, you can chat with the agent in the terminal, where it responds as the specialized technical writer described in the configuration.
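
Because the agent uses an OpenAI model, the run depends on OPENAI_API_KEY being exported in the same shell session. The following is a minimal sketch of a guard around the run command. The function names require_env and run_agent are hypothetical helpers, not Cagent features, and the sketch assumes printenv is available, as it is on most Linux and macOS systems.

```shell
# Return 0 only if the named variable is exported and non-empty.
# printenv sees exported variables only, which matches what a child
# process such as cagent would see.
require_env() {
  if [ -z "$(printenv "$1")" ]; then
    echo "error: $1 is not set or not exported" >&2
    return 1
  fi
}

# Run an agent file, but stop early when the key is missing.
run_agent() {
  require_env OPENAI_API_KEY || return 1
  cagent run "$1"
}
```

With these definitions loaded, run_agent assistant.yaml fails fast with a readable message rather than starting a session that has no credentials.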

Using Cagent With MCP (Model Context Protocol)

MCP is an open standard introduced by Anthropic for connecting agents to external tools and data sources, such as databases, search engines, and APIs.

For this example, we will build a “Gemini Expert” agent responsible for searching and retrieving the Gemini API documentation using the Gemini MCP tool.

First, define the gemini_expert.yaml file.

    YAML
   
version: "1"
agents:
  root:
    model: anthropic/claude-3-5-sonnet # High reasoning model for complex docs
    description: "Technical specialist for Google Gemini API and SDKs."
    instruction: |
      You are a Gemini API Specialist. When a user asks how to implement a 
      feature (like context caching or multimodal input), use the 
      'gemini-api-docs' tools to:
      1. Search for relevant documentation entries.
      2. Retrieve code snippets and implementation guides.
      3. Explain the best practices based on the fetched docs.
    toolsets:
      - type: mcp
        ref: docker:gemini-api-docs # Reference the server from the Docker Hub MCP catalog
  

Then, when you run cagent run gemini_expert.yaml, you will see Cagent using the Gemini MCP server to list the APIs available in the documentation.
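
Note that this configuration uses an Anthropic model rather than the OpenAI model from the first example, so a different API key has to be exported before the run. The sketch below maps a model reference to the environment variable that would hold its key. The helper name key_for_model is hypothetical, and the variable names are assumptions based on each provider's usual convention, so verify them against the Cagent documentation.

```shell
# Print the environment variable expected to hold the API key for a
# model reference such as "openai/gpt-4o".
# The variable names below are assumed, not confirmed by this article.
key_for_model() {
  case "$1" in
    openai/*)    echo "OPENAI_API_KEY" ;;
    anthropic/*) echo "ANTHROPIC_API_KEY" ;;
    google/*)    echo "GOOGLE_API_KEY" ;;
    *)           echo "unknown provider in '$1'" >&2; return 1 ;;
  esac
}

# The model used in gemini_expert.yaml:
key_for_model anthropic/claude-3-5-sonnet   # prints ANTHROPIC_API_KEY
```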

The toolsets section is where the MCP server is declared. Other available toolsets include filesystem for working with files and directories, shell for executing commands, think for reasoning, and memory for storing and retrieving information across conversations and sessions.
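
As a rough illustration of how the built-in toolsets named above might be combined, the sketch below writes a small agent file from the terminal. The helper name write_toolsets_demo, the agent text, and the path option on the memory toolset are all assumptions made for illustration, so check the Cagent documentation for the exact options each toolset accepts.

```shell
# Write a sketch of an agent that combines the built-in toolsets.
# $1 is the output path; everything inside the file is illustrative.
write_toolsets_demo() {
  cat > "$1" <<'EOF'
version: "1"
agents:
  root:
    model: openai/gpt-4o
    description: "Assistant with local tools."
    instruction: |
      Use the available tools to inspect files and remember user preferences.
    toolsets:
      - type: filesystem   # work with files and directories
      - type: shell        # execute commands
      - type: think        # scratchpad for reasoning
      - type: memory       # keep information across sessions
        path: ./memory.db  # assumed option name
EOF
}
```

Calling write_toolsets_demo toolsets-demo.yaml creates the file, which could then be started with cagent run toolsets-demo.yaml.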

Multi-Agent Workflows With Cagent

Now, let's combine the Gemini API documentation agent with a technical writer agent, both coordinated by a project manager agent, to produce a guide on building a simple agent with the Gemini API.

We will call this multi-agent.yaml.

    YAML
   
version: "1"
agents:
  root:
    model: openai/gpt-4o
    description: "Project Manager for AI Integration guides."
    instruction: |
      You are the Project Manager. Your goal is to explain how to build an agent using the Gemini API.
      1. Ask the 'researcher' to find the specific Gemini API methods for "System Instructions" and "Tool Use".
      2. Send the researcher's findings to the 'writer'.
      3. Ensure the final blog post includes a clear code example discovered by the researcher.
    sub_agents:
      - researcher
      - writer

  researcher:
    model: openai/gpt-4o-mini
    description: "Gemini Documentation Specialist."
    instruction: |
      You gather technical specifications from the Gemini API documentation.
      Focus on finding:
      - How to initialize the model.
      - How to pass 'system_instruction' to an agent.
      - The syntax for 'tools' (function calling).
    toolsets:
      - type: mcp
        ref: docker:gemini-api-docs # This connects directly to Google's structured docs

  writer:
    model: openai/gpt-4o
    description: "Technical Content Creator."
    instruction: |
      You take technical research notes and turn them into a polished blog post.
      Explain the Gemini API implementation in a way that a developer can follow.
      Always include a Python or Node.js snippet based on the researcher's data.
  

The root agent (manager) is the only agent that talks to the user. It handles orchestration by breaking down the task and assigning the pieces to the right specialists.
The sub-agents (specialists) are hidden from the user and only respond when the root agent calls them.

When you run this configuration, the root agent coordinates the workflow by delegating the research and writing tasks to its specialized sub-agents and consolidating their outputs into a single, user-facing response.
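
Delegation only works if every name listed under sub_agents matches an agent defined in the same file. The sketch below is a rough text-based check for that, assuming the two-space indentation used in the configuration above. The function name check_sub_agents is hypothetical, and this is not a YAML parser or a Cagent feature.

```shell
# Verify that every entry under a "sub_agents:" list names a defined agent.
# Assumes agent names are keys indented by exactly two spaces.
check_sub_agents() {
  file="$1"
  # Defined agents, e.g. "  researcher:" yields "researcher".
  defined="$(sed -n 's/^  \([A-Za-z0-9_-]*\):[[:space:]]*$/\1/p' "$file")"
  # Referenced agents: list items directly below a "sub_agents:" key.
  referenced="$(awk '
    /^ *sub_agents:/   { in_list = 1; next }
    in_list && /^ *- / { sub(/^ *- */, ""); print; next }
                       { in_list = 0 }
  ' "$file")"
  rc=0
  for name in $referenced; do
    if ! printf '%s\n' "$defined" | grep -qxF "$name"; then
      echo "undefined sub-agent: $name" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_sub_agents multi-agent.yaml before cagent run would catch a misspelled sub-agent name early.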

Finally, if you would like to share the agent with the broader community, push the configuration to Docker Hub just like a container image:

    PowerShell
   
   cagent push ./multi-agent.yaml your-username/tech-team:v1
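
A mistyped file path or a missing tag is easy to overlook in this command. The sketch below wraps the push with two sanity checks and an optional dry run. The function name push_agent and the DRY_RUN variable are hypothetical, and insisting on an explicit tag is a design choice of this sketch rather than a Cagent requirement.

```shell
# Push an agent file after basic sanity checks.
# $1 is the agent file, $2 is a reference of the form namespace/name:tag.
push_agent() {
  file="$1"
  ref="$2"
  if [ ! -f "$file" ]; then
    echo "error: agent file '$file' not found" >&2
    return 1
  fi
  case "$ref" in
    */*:*) ;;  # looks like namespace/name:tag
    *) echo "error: expected namespace/name:tag, got '$ref'" >&2
       return 1 ;;
  esac
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cagent push $file $ref"   # show the command without running it
  else
    cagent push "$file" "$ref"
  fi
}
```

For example, DRY_RUN=1 push_agent ./multi-agent.yaml your-username/tech-team:v1 prints the command that would run, which makes it easy to confirm the path and tag first.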

Conclusion

Docker Cagent makes it simple to build and run AI agents by adopting a configuration-first approach. It removes much of the friction of manual orchestration and complex coding patterns, which speeds up development. Because models and tools are declared in YAML, you can swap providers or toolsets without rewriting code, which helps keep your AI stack vendor-agnostic.

Opinions expressed by DZone contributors are their own.
