Overview

Working with Claude Code in its default mode means constant interruptions — approve this command, allow that file write, confirm before running tests. The safety prompts make sense in theory, but in practice they break the flow and force me to babysit something I'm trying to delegate.

What I actually wanted was the opposite: eliminate every barrier inside the codebase so the agent can work without asking permission, while putting stronger guardrails around everything outside it. Let the agent run free in the sandbox; make sure it can't touch anything beyond the sandbox walls.

So I built Agent Sandbox: a Bash CLI that runs Claude Code inside an isolated Docker container in "YOLO mode" (--dangerously-skip-permissions). The agent gets full, unrestricted access to whatever repo I point it at: no approval prompts, no restrictions on what it can read, write, or execute. Everything else on the host stays outside the container and out of the agent's reach. When the session is done, the container disappears.

How it works

The core is a single agent command that orchestrates Docker. When I start a session, it builds a minimal Ubuntu 24.04 image with Node.js, Python, Java, the GitHub CLI, and Claude Code pre-installed, then runs the container with the right mounts and environment wired up.
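As a rough illustration of that launch step, the sketch below assembles a docker run invocation as a Bash array. It is not the actual agent command: the image tag, the agent user, the /workspace mount point, and the history/ folder name are all assumptions made for the example. Only the Docker flags and --dangerously-skip-permissions are real.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical image tag; the real one is not given in the text.
IMAGE="agent-sandbox:latest"

# Fill the global RUN_ARGS array with the arguments for `docker`.
# Mount points and the in-container user "agent" are assumptions.
build_run_args() {
  local repo_dir="$1" session_dir="$2"
  RUN_ARGS=(
    run --rm -it                      # throwaway container with a TTY
    -v "${repo_dir}:/workspace"       # the only host code the agent sees
    -v "${session_dir}/history:/home/agent/.claude/projects"
    -w /workspace
    -e GH_TOKEN                       # pass the host's GH_TOKEN through
    "$IMAGE"
    claude --dangerously-skip-permissions
  )
}

# Usage (needs Docker and a built image):
#   build_run_args "$PWD" "$HOME/.agent-sandbox/sessions/demo"
#   docker "${RUN_ARGS[@]}"
```

Building the command as an array keeps paths with spaces intact, and --rm is what makes the container disappear when the session ends.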

The non-obvious piece was credential bridging. Claude Code on macOS stores its OAuth token in the system Keychain. The entrypoint script extracts that token on the host side and rewrites it into the Linux credential format expected inside the container — so the agent authenticates as me, without ever baking secrets into the image. GitHub tokens follow the same pattern via gh auth token.
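A minimal sketch of the bridging idea follows, under stated assumptions: the Keychain service name "Claude Code-credentials" and the .credentials.json file name are my best guess at what Claude Code uses, not something confirmed by the text, and both function names are hypothetical. The security and gh auth token calls themselves are standard CLI commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed Keychain service name; verify against your own Keychain entry.
KEYCHAIN_SERVICE="Claude Code-credentials"

# macOS only: print the stored secret for the service to stdout.
read_keychain_credentials() {
  security find-generic-password -s "$KEYCHAIN_SERVICE" -w
}

# Read a credential payload on stdin and write it, owner-readable only,
# to <dest_dir>/.credentials.json (assumed Linux-side file name).
write_credentials() {
  local dest_dir="$1" payload
  payload="$(cat)"
  if [ -z "$payload" ]; then
    echo "empty credential payload" >&2
    return 1
  fi
  mkdir -p "$dest_dir"
  # Subshell keeps the tight umask from leaking into the caller.
  ( umask 077; printf '%s\n' "$payload" > "$dest_dir/.credentials.json" )
}

# Host-side usage, with a hypothetical staging directory that would be
# mounted into the container at runtime:
#   read_keychain_credentials | write_credentials "$session_dir/creds"
#   export GH_TOKEN="$(gh auth token)"
```

Because the file is written into a directory that is mounted at runtime, the secret never becomes part of an image layer.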

Sessions are persistent even though containers are ephemeral. Each session gets a directory under ~/.agent-sandbox/sessions/ holding metadata and a conversation history folder that gets mounted into the container at startup. Resuming a session picks up the same Claude context where it left off.
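The create-or-resume behavior could be sketched like this. Only the ~/.agent-sandbox/sessions/ root comes from the description above. The function name, the plain-text repo and created_at files, and the history/ folder name are placeholders, since the real metadata format is not specified.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Root directory from the text; overridable for experiments.
SESSIONS_ROOT="${SESSIONS_ROOT:-$HOME/.agent-sandbox/sessions}"

# Create the session directory on first use and reuse it on resume.
# Prints the session path on stdout. File names are hypothetical.
session_open() {
  local name="$1" repo_dir="$2"
  local dir="$SESSIONS_ROOT/$name"
  if [ ! -d "$dir" ]; then
    mkdir -p "$dir/history"                       # mounted into the container
    printf '%s\n' "$repo_dir" > "$dir/repo"       # metadata: target repo
    date -u +%Y-%m-%dT%H:%M:%SZ > "$dir/created_at"
  fi
  printf '%s\n' "$dir"
}

# Usage:
#   dir="$(session_open my-feature "$PWD")"
#   # ...mount "$dir/history" into a fresh container...
```

Calling session_open a second time with the same name leaves the existing metadata and history untouched, which is what lets a new container pick up an old conversation.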

A few other things I built in:

The whole thing is ~900 lines of Bash with no external dependencies beyond Docker, jq, curl, and the GitHub CLI.

Links