Run AI Agents Safely With Docker Sandboxes: A Complete Walkthrough

A full walkthrough of how to set up Docker sandboxes on a local machine and how to run AI agents safely in YOLO mode without corrupting the host environment.

By Naga Santhosh Reddy Vootukuri, DZone Core · Apr. 07, 26 · Tutorial
There are days when I want an agent to work on a project, run commands, install packages, and poke around a repo without getting anywhere near the rest of my machine. That is exactly why Docker Sandboxes clicked for me.

The nice part is that the setup is not complicated. You install the CLI, sign in once, choose a network policy, and launch a sandbox from your project folder. After that, you can list it, stop it, reconnect to it, or remove it when you are done.

In this post, I am keeping the focus narrow on purpose: set up Docker Sandboxes, run one against a local project, understand the few commands that matter, and avoid the mistakes that usually slow people down on day one.

What Are Docker Sandboxes?

Docker Sandboxes give you an isolated environment for coding agents. Each sandbox runs inside its own microVM and gets its own filesystem, network, and Docker daemon.

The simple way to think about it is this: the agent gets a workspace to do real work, but it does not get free access to your whole laptop.

That is the reason this feature is interesting. You can let an agent install packages, edit files, run builds, and even run Docker commands inside the sandbox without turning your host machine into the experiment.

Before You Start

You do not need a big lab setup to try this, but you do need:

  • A macOS or Windows machine
  • On Windows, the "HypervisorPlatform" feature enabled
  • The Docker sbx CLI installed (Step 1 below)
  • An API key or other authentication for the agent you want to use

If you start with the built-in shell agent, Docker sign-in is enough for your first walkthrough. If you want to start with claude, copilot, codex, gemini, or another coding agent, make sure you also have that agent's authentication ready.

If you are on Windows, make sure Windows Hypervisor Platform is enabled first. Run the following from an elevated (Administrator) PowerShell prompt:

PowerShell
 
Enable-WindowsOptionalFeature -Online -FeatureName HypervisorPlatform -All


If Windows asks for a restart, do that before moving on.

Note: Docker documents the getting-started flow with the sbx CLI. There is also a docker sandbox command family, but sbx is the cleanest way to get started, so that is what I am using in this walkthrough.

Step 1: Install the Docker Sandboxes CLI

On Windows:

PowerShell
 
winget install -h Docker.sbx


On macOS:

Shell
 
brew install docker/tap/sbx


That is it for installation.

If sbx is not recognized immediately after install, open a new terminal window and try again. I hit that once on Windows after installation, and a fresh terminal fixed it.
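If you would rather confirm the install from a script, a PATH lookup is usually enough. Here is a minimal Python sketch of that check. The helper name find_cli is my own, not part of any Docker tooling, and it only tells you whether an executable called sbx is visible on PATH for the current process, not whether it runs correctly.

```python
import shutil
from typing import Optional


def find_cli(name: str = "sbx") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


path = find_cli()
if path is None:
    # Matches the symptom described above: the shell has not picked up the new PATH yet.
    print("sbx is not on PATH yet; open a new terminal window and try again.")
else:
    print(f"sbx found at {path}")
```

Because the lookup reads the PATH of the process that runs it, run it from a fresh terminal for the same reason you would retry sbx itself from one.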

Note: Docker Desktop is not required for sbx.

Step 2: Sign In

Now sign in once:

PowerShell
 
sbx login


This opens the Docker sign-in flow in your browser. During login, Docker asks you to choose a default network policy for your sandboxes:

  • Open – Everything is allowed
  • Balanced – Common development traffic is allowed, but it is more controlled
  • Locked down – Everything is blocked unless you explicitly allow it

If you are just getting started, pick Balanced.

That is the easiest choice for a first run because it usually works without making the sandbox too open.

Step 3: Pick a Small Project Folder

You can use an existing project folder, or create a tiny test folder just for this walkthrough.

For example:

PowerShell
 
mkdir hello-sandbox
cd hello-sandbox


If you want, drop a file into it so you have something visible inside the sandbox:

PowerShell
 
echo "# hello-sandbox" > README.md


Nothing fancy is needed here. The goal is just to have a folder you are comfortable letting the agent work in.
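If you want the same folder setup to behave identically on macOS and Windows, a few lines of Python can do it. This is only a sketch: make_project is a hypothetical helper name, and the folder and file names simply mirror the example above. One reason to bother is that the > redirect in Windows PowerShell 5.1 writes UTF-16 by default, whereas this version always writes UTF-8.

```python
from pathlib import Path


def make_project(parent: Path, name: str = "hello-sandbox") -> Path:
    """Create the project folder if needed and add a one-line README inside it."""
    project = parent / name
    project.mkdir(parents=True, exist_ok=True)

    readme = project / "README.md"
    if not readme.exists():
        # Never overwrite a README that is already there.
        readme.write_text(f"# {name}\n", encoding="utf-8")
    return project


# Example (not executed here): make_project(Path.cwd())
```

The function is safe to call twice: an existing folder is reused and an existing README is left alone.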

Step 4: Run Your First Sandbox

Here is the command that matters most:

PowerShell
 
sbx run shell .


Figure 1.1: Creating a new sandbox with the sbx command


What this does:

  • Starts a sandbox for the shell agent
  • Mounts your current folder into the sandbox
  • Opens an isolated environment where the agent can work on that folder

If you prefer naming your sandbox from the start, use:

PowerShell
 
sbx run --name my-first-sandbox shell .


On the first run, Docker may take a little longer because it needs to pull the agent image. That is normal. Later runs are much faster.

I like starting with shell because it is the easiest way to prove the sandbox is working before you bring an actual coding agent into the mix.

Once that works, replace shell with the agent you actually want to use, such as claude, copilot, codex, gemini, or another supported agent from the Docker docs.
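If you end up launching sandboxes from scripts, it helps to build the command in one place. The sketch below assembles the argument list for sbx run using only the two forms shown above (agent plus folder, with an optional --name). The function build_run_command is a hypothetical helper of mine, and it prints the commands rather than running them.

```python
import shlex
from typing import List, Optional


def build_run_command(agent: str, workspace: str = ".", name: Optional[str] = None) -> List[str]:
    """Build the argument list for `sbx run`, limited to the forms shown in this article."""
    cmd = ["sbx", "run"]
    if name:
        # Optional sandbox name, placed before the agent as in the example above.
        cmd += ["--name", name]
    cmd += [agent, workspace]
    return cmd


print(shlex.join(build_run_command("shell")))
# prints: sbx run shell .
print(shlex.join(build_run_command("shell", name="my-first-sandbox")))
# prints: sbx run --name my-first-sandbox shell .
```

Passing the returned list to subprocess.run would start the sandbox from Python. Keeping it as a list rather than a single string avoids quoting problems when a folder path contains spaces.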

Step 5: See What Is Running

To check your active sandboxes, run:

PowerShell
 
sbx ls


You should see output with a name, status, and uptime.

This is a handy command because once you start using sandboxes regularly, it becomes the quickest way to see what is still running and what needs cleanup.

Figure 1.2: Listing all active sandboxes running on the machine
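To fold the same check into a script, you can capture the output of sbx ls from Python. This is a minimal sketch under two assumptions of mine: the helper name list_sandboxes is hypothetical, and the output is returned as raw text because this walkthrough does not pin down its exact format, so the code does not try to parse it.

```python
import subprocess
from typing import Callable


def list_sandboxes(run: Callable = subprocess.run) -> str:
    """Run `sbx ls` and return its output as plain text, without parsing it."""
    # check=True raises CalledProcessError if the command exits with an error.
    result = run(["sbx", "ls"], capture_output=True, text=True, check=True)
    return result.stdout


# Example (not executed here): print(list_sandboxes())
```

The run parameter exists only so the function can be exercised with a stand-in when sbx is not installed, for example in a unit test.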


Step 6: Switch to a Real Coding Agent

Once you have proved the sandbox works with shell, move to the coding agent you actually want to use.

For example:

PowerShell
 
sbx run copilot


Figure 1.3: Running the Copilot agent in a Docker sandbox


or 

PowerShell
 
sbx run gemini


Figure 1.4: Running the Gemini agent in a Docker sandbox


The workflow is the same as shell. The only thing that changes is the agent inside the sandbox.

If the agent needs its own provider login or API key, complete that setup and then continue. The important point is that the agent is still running inside the sandbox, not directly on your host machine.

Step 7: Stop the Sandbox When You Are Done

When you are finished with the sandbox, stop it by name. Mine is called copilot-dockersandboxtest; use the name of your own sandbox instead:

PowerShell
 
sbx stop copilot-dockersandboxtest


If you don't remember the name, run sbx ls first to see all the active sandboxes running. Stopping is useful when you want to pause work without removing the sandbox immediately.

Step 8: Remove the Sandbox When You No Longer Need It

When you are done for good, you can remove it by running the command below:

PowerShell
 
sbx rm copilot-dockersandboxtest


Or remove all sandboxes at once by passing the --all flag:

PowerShell
 
sbx rm --all


Figure 1.5: Removing all sandboxes with the sbx rm --all command
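Since sbx rm --all removes every sandbox, it is worth making that choice explicit if you script your cleanup. The sketch below builds the stop and remove commands from Steps 7 and 8 and refuses ambiguous input. Both function names are hypothetical helpers of mine, and they only cover the command forms shown in this article.

```python
from typing import List, Optional


def build_stop_command(name: str) -> List[str]:
    """Build the argument list for stopping one sandbox by name."""
    if not name:
        raise ValueError("a sandbox name is required to stop a sandbox")
    return ["sbx", "stop", name]


def build_rm_command(name: Optional[str] = None, remove_all: bool = False) -> List[str]:
    """Build the argument list for removing one sandbox, or all of them on request."""
    if remove_all and name:
        raise ValueError("pass either a sandbox name or remove_all=True, not both")
    if remove_all:
        return ["sbx", "rm", "--all"]
    if not name:
        # Forgetting the name must never silently turn into "remove everything".
        raise ValueError("give a sandbox name, or set remove_all=True explicitly")
    return ["sbx", "rm", name]


print(build_stop_command("copilot-dockersandboxtest"))
# prints: ['sbx', 'stop', 'copilot-dockersandboxtest']
print(build_rm_command("copilot-dockersandboxtest"))
# prints: ['sbx', 'rm', 'copilot-dockersandboxtest']
```

The design choice here is that removing everything always takes a deliberate remove_all=True, so an empty or missing name raises an error instead of widening the cleanup.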


Step 9: Use YOLO Mode Safely

Now for the newer idea Docker has announced: YOLO mode. If you want to read more about it, Docker's recent blog post, Docker Sandboxes: Run Agents in YOLO Mode, Safely, is worth bookmarking.

In simple terms, YOLO mode means letting a coding agent work with fewer interruptions and fewer approval prompts. That can save time, but it only makes sense when the agent is already inside a sandbox.

Note: I would not start with YOLO mode on day one. I would start with a normal sandbox run, get comfortable with the lifecycle first, and only then try YOLO mode.

Conclusion

This article covered what Docker Sandboxes are and how to get started with them step by step. What I like about them is that they remove a lot of friction from a very real problem. Sometimes you want an agent to have freedom, but not too much freedom. You want it to run commands, inspect files, and do useful work, but you also want a clear boundary around that work. That is the sweet spot Docker Sandboxes are aiming for.

If you are curious about them, my advice is simple: do not start with a giant repo or a complicated setup. Pick one small folder, use the Balanced policy, run a single sandbox, and get comfortable with the basic lifecycle. Once that clicks, moving on to YOLO mode feels much easier.

