Agents Need Deterministic Sandboxes
I still want my coding agents to run in a robust sandbox by default, one that restricts file access and network connections in a deterministic way. I trust those a whole lot more than prompt-based protections like this new auto mode. @simonw
Claude presents two options out of the box:
- babysit its every action or
- give it free rein over the system with
--dangerously-skip-permissions.
I’m tired of babysitting my coding agents, but I don’t want them running rm -rf /. The approach Auto mode for Claude uses
to take the burden off of us is an application of an LLM as
Judge system. Claude nests
untrusted prompts within trusted ones to interrogate their safety. E.g. “Is it
safe to run the following prompt: <UNVALIDATED_PROMPT_HERE>”.
This is an improvement! It’s, however, akin to protecting against script-injection by injecting into a “safer” script first. SQL and shell injection have shown us time and time again that there's no replacement for a real sandbox.
I believe there's a third way beyond babysitting and chaos that lets me run Claude where I have all my tools and context (instead of having to spin up and juggle all the tools, credentials, etc. installed on a Sprite or exe.dev instance). One that's both safer and more productive: run each action Claude requests in its own action-specific sandbox..
That's why we made Clash, a sandbox built for users first.
The insight: I trust the Claude Code harness itself differently than the actions the LLM takes through it. Throwing the whole shebang in a container isn't actually performing the job I want. What I want is for Claude to extend my abilities with all the context and power I have locally (including helping review PRs, pushing to dev branches, performing QA on secure staging environments, etc.). To do that I need different actions to be sandboxed differently. I don't want to babysit, but I want control over what Claude runs and visibility into what's happening as a result. Safety and control shouldn't come at the cost of productivity. I should be able to do this without containerizing or replicating my environment across multiple machines.
Clash policies match actions being performed to an OS-enforced sandbox (on macOS
w/
Seatbelt
and Linux w/ Landlock) you want them to run in. Claude
provides a plugin interface to intercept every action performed. Clash hooks
into this layer and rewrites / wraps every action performed in a session.
Want Claude to be able to curl, but only to your local server running on port
8080? Done. git fetch can pull from GitHub but nothing else can read your
.ssh/ keys? Done. Want any bash command to be able to safely read from your
repository directory but not see anything else? Done. With Clash we can
control exactly which tools Claude can use and route any shell command to
precise OS-enforced sandboxes.
The project is a few months old now. We’ve spent a majority of our time iterating on the ergonomics of policy creation and management. We’ll soon post about our journey (it started with a YAML DSL, then S-expressions which grew into their own Scheme… and now Starlark+JSON). It’s a great experience once you settle into a policy that fits your workflow. We’ve found meaningful productivity (and peace of mind) gains once we’ve gotten set up, and we're working on prebuilt "policy packs" to bootstrap new users. If Clash resonates with you, please give it a try!
Head over to the quickstart guide to get started. Please file any and all issues you might find (or feedback you have) on the Clash GitHub project.