Open Source • Apache 2.0

agentsh is

Agents can be tricked, jailbroken, or simply wrong. agentsh enforces what they can do at runtime—no matter what the prompt, tool output, or user says.

Enforce at the system boundary. Prompts are advisory; syscalls are real. agentsh evaluates every file/network/process action and applies your policy before it happens.

Runtime Visibility
Policy Enforcement
Audit Events
agentsh session
agentsh exec $SID -- curl https://api.example.com
net_connect → api.example.com:443 allow
file_write → /tmp/response.json allow
agentsh exec $SID -- rm -rf /workspace/cache
delete → /workspace/cache approve?
↳ "Confirm delete? [y/N]"
agentsh exec $SID -- cat ~/.ssh/id_rsa
file_read → ~/.ssh/id_rsa deny

The agent proposes.
The policy decides.

Prompt injection, jailbreaks, and plain old mistakes all look the same at the OS boundary. agentsh intercepts runtime actions and enforces your rules—allow, deny, approve, or redirect—with a complete audit trail.

Prompt-proof enforcement

Prompts can drift. Policies don’t. Enforcement happens at the system call level—where files open, sockets connect, and processes spawn.

See everything

Every file, network, and process operation is captured—including subprocess trees—so you can understand what really happened.

Approval gates

Risky operations pause for explicit confirmation. Agents can request, but humans (or CI policy) decide.

Why “just tell the agent not to” fails

Prompt injection — Content in files, web pages, or tool output can hijack instructions mid-run.
Jailbreaks — Clever phrasing can persuade the model to “make an exception” for something unsafe.
Reasoning errors — Agents misunderstand tasks and take destructive actions “by accident.”
Subprocess blind spots — pip installs, npm scripts, makefiles: tools spawn work the agent can't fully "see."
Secret exposure — Environment variables, .env files, mounted credentials—all accessible to the agent, even in containers.

Drop-in runtime gateway

Place agentsh under your agent or harness. It intercepts syscalls, applies your policy, and emits structured events you can route anywhere.

Agent / Harness
Claude, GPT, Cursor…
request
agentsh
Policy engine
allow
deny
approve
if allowed
System
Files, network, processes
events
Audit Log
Structured events
allow
Operation proceeds normally
deny
Operation blocked with message
approve
Human confirmation required
redirect
Swap to safe alternative
audit
Allow + detailed logging
soft_delete
Quarantine with restore
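
The decision list above can be pictured as one small step between the agent's request and the system. A minimal sketch in Python, assuming hypothetical names (`Decision`, `handle`, and the event fields are illustrative, not agentsh's internals or its real event schema):

```python
import json
import time
from enum import Enum


class Decision(Enum):
    # The six decisions listed above.
    ALLOW = "allow"
    DENY = "deny"
    APPROVE = "approve"
    REDIRECT = "redirect"
    AUDIT = "audit"
    SOFT_DELETE = "soft_delete"


def handle(operation, target, decision, approver=None):
    """Return (proceeds, event) for one intercepted operation.

    Hypothetical gateway step: decide whether anything reaches the
    system, and build the structured event that is emitted either way.
    """
    if decision is Decision.DENY:
        proceeds = False
    elif decision is Decision.APPROVE:
        # Pause for a human (or CI policy); no approver means no approval.
        proceeds = bool(approver and approver(operation, target))
    else:
        # allow, audit, redirect and soft_delete all let something run;
        # what runs for redirect/soft_delete is decided by the policy.
        proceeds = True
    event = {
        "ts": time.time(),
        "operation": operation,
        "target": target,
        "decision": decision.value,
        "proceeds": proceeds,
    }
    return proceeds, event


if __name__ == "__main__":
    ok, event = handle("file_read", "~/.ssh/id_rsa", Decision.DENY)
    print(json.dumps(event))  # one JSON line per event
```

The point of the sketch: an event is produced for every operation, including the ones that never run.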

Start in under a minute

1 Install agentsh

Debian/Ubuntu
# Download from GitHub releases
sudo dpkg -i agentsh_<VERSION>_linux_amd64.deb

# Or build from source
make build
sudo install -m 0755 bin/agentsh /usr/local/bin

2 Run through it

Terminal
# Create a session
SID=$(agentsh session create --workspace . | jq -r .id)

# Run commands through agentsh
agentsh exec "$SID" -- ls -la

# Structured output for agents
agentsh exec --output json "$SID" -- curl https://example.com
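
If your harness is written in Python, the same three commands can be wrapped in a few helpers. A minimal sketch: it uses only the CLI invocations shown above, the helper names are hypothetical, and it assumes `agentsh session create` prints a JSON object with an `id` field (as the `jq -r .id` call implies).

```python
import json
import subprocess


def session_create_argv(workspace="."):
    # Mirrors: agentsh session create --workspace .
    return ["agentsh", "session", "create", "--workspace", workspace]


def exec_argv(sid, command, json_output=False):
    # Mirrors: agentsh exec [--output json] "$SID" -- <command...>
    argv = ["agentsh", "exec"]
    if json_output:
        argv += ["--output", "json"]
    return argv + [sid, "--"] + list(command)


def create_session(workspace="."):
    # Equivalent of `... | jq -r .id`: parse the JSON and read "id".
    out = subprocess.run(
        session_create_argv(workspace),
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["id"]


def run(sid, command, json_output=False):
    # Returns the CompletedProcess; a non-zero exit does not raise.
    return subprocess.run(
        exec_argv(sid, command, json_output),
        capture_output=True, text=True,
    )
```

Typical use would be `sid = create_session()` followed by `run(sid, ["ls", "-la"])`. Passing the command as a list avoids a second round of shell quoting.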

3 Tell your agent to use it

AGENTS.md / CLAUDE.md (add to your repo)
## Shell access

- Run commands via agentsh (not directly in bash/zsh).
- Use: agentsh exec $SID -- <your-command>
- For structured output: agentsh exec --output json $SID -- <cmd>
- Get session ID first: SID=$(agentsh session create --workspace . | jq -r .id)
Container Deployment

Containers isolate.
agentsh governs.

Containers limit where an agent can cause damage, but inside the container it's still a free-for-all. The agent can read any file the container can see, access your env vars and secrets, hit any reachable endpoint, and delete your workspace. agentsh adds the missing layer: control over what the agent can actually do.

Install a lightweight shell shim in your container. The agent thinks it's calling /bin/bash—but every command routes through agentsh and gets policy-checked.

  • No changes to agent code or prompts
  • Works with any framework (Claude Code, Cursor, custom)
  • Captures subprocess trees spawned by scripts
  • Same allow/deny/approve/redirect decisions
Dockerfile
FROM debian:bookworm-slim

# Install agentsh (.deb from GitHub releases, placed in the build context)
COPY agentsh_*_linux_amd64.deb /tmp/
RUN dpkg -i /tmp/agentsh_*_linux_amd64.deb

# Install the shell shim — this is the magic
RUN agentsh shim install-shell \
  --root / \
  --shim /usr/bin/agentsh-shell-shim \
  --bash \
  --i-understand-this-modifies-the-host

# Point to agentsh server (sidecar or host)
ENV AGENTSH_SERVER=http://127.0.0.1:8080

# Now any /bin/bash or /bin/sh call goes through agentsh
# Agents never know the difference
What happens: The shim takes the place of /bin/bash and /bin/sh. When an agent calls subprocess.run(["bash", "-c", "…"]), the call reaches agentsh, which applies policy and logs the outcome.
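
From the agent's side nothing changes. A minimal sketch of a typical shell tool in a Python agent (`run_shell` is a hypothetical name; how a denied command surfaces to the caller, for example exit code or stderr text, is not specified here and is left open):

```python
import shutil
import subprocess


def run_shell(script):
    # Typical agent tool: hand a script to bash and capture the result.
    # In a container with the shim installed, this same call would go
    # through agentsh; the agent code itself stays unchanged.
    shell = shutil.which("bash") or "sh"  # fallback so the sketch runs anywhere
    return subprocess.run([shell, "-c", script], capture_output=True, text=True)


if __name__ == "__main__":
    result = run_shell("echo hello from the agent")
    print(result.returncode, result.stdout.strip())
```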
Full Harness Protection

Wrap the harness, not just the shell

Harnesses like Claude Code or Cursor include built-in tools that bypass the shell—file edits, execution, and network requests. Run the harness itself under agentsh to govern everything end-to-end.

The gotcha: When a harness writes a file via an internal tool (e.g., str_replace), no shell is spawned. If you only shimmed bash, you’d miss it.
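
To see why, here is a minimal sketch of what a harness-style edit tool can look like. The implementation is hypothetical (only the name `str_replace` comes from the text above): it is plain in-process file I/O, so no bash or sh process ever starts.

```python
import tempfile
from pathlib import Path


def str_replace(path, old, new):
    # Hypothetical harness edit tool: read, replace once, write back.
    # No shell is spawned, so a shell shim never sees this write.
    p = Path(path)
    text = p.read_text()
    if old not in text:
        raise ValueError("old string not found")
    p.write_text(text.replace(old, new, 1))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "main.py"
        target.write_text("DEBUG = False\n")
        str_replace(target, "False", "True")
        print(target.read_text())
```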

  • Catches built-in file/network tools, not just shell commands
  • Policy applies to the harness and everything it spawns
  • One complete audit trail for the entire agent run
  • Works with Claude Code, Cursor, Aider, or custom harnesses
Wrap the entire harness
# Create a session for the entire agent run
SID=$(agentsh session create \
  --workspace /project \
  --policy agent-sandbox | jq -r .id)

# Run your agent harness UNDER agentsh
agentsh exec "$SID" -- claude-code --project /project

# Or with Cursor, Aider, custom harness...
agentsh exec "$SID" -- cursor --folder /project
agentsh exec "$SID" -- python my_agent.py
What gets captured
# Even built-in harness tools are governed:
├─ file_write → /project/src/main.py      ✓ allow
├─ file_read  → /etc/passwd               ✗ deny
├─ net_connect → api.openai.com:443       ✓ allow
├─ net_connect → evil.com:443             ✗ deny
├─ exec       → npm install               ✓ allow
│  └─ file_write → /project/node_modules  ✓ allow
│  └─ net_connect → registry.npmjs.org    ✓ allow
├─ exec       → curl http://attacker.com  ✗ deny
└─ file_delete → /project/.env            ? approve

Policies as code

Define what’s allowed, what needs approval, what gets redirected, and what’s blocked—using simple YAML you can version, review, and ship.

policies/dev-safe.yaml
file_rules:
  - name: allow-workspace
    paths: ["/workspace/**"]
    operations: [read, write, create]
    decision: allow

  - name: approve-delete
    paths: ["/workspace/**"]
    operations: [delete]
    decision: approve
    message: "Delete {{.Path}}?"

  - name: deny-secrets
    paths: ["**/.env", "**/.env.*", "**/credentials*"]
    decision: deny

  - name: deny-ssh
    paths: ["~/.ssh/**"]
    decision: deny

network_rules:
  - name: allow-api
    domains: ["api.example.com"]
    ports: [443]
    decision: allow

command_rules:
  - name: block-env-dump
    commands: [env, printenv]
    decision: deny

  - name: block-dangerous
    commands: [rm, shutdown, reboot]
    decision: deny

env_rules:
  - name: npm-registry-only
    commands: [npm, yarn]
    allow: [NPM_TOKEN, NODE_ENV]

  - name: db-migrate-only
    commands: [prisma, drizzle]
    allow: [DATABASE_URL]
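
To reason about which rule fires, it can help to model the matching. A minimal sketch in Python of how the `file_rules` above could be evaluated. Everything about the matching is an assumption, not agentsh's documented behavior: first match wins, a rule without `operations` applies to every operation, `*` may cross `/`, and unmatched paths are denied.

```python
import os
from fnmatch import fnmatchcase

# file_rules from dev-safe.yaml above, transcribed by hand.
FILE_RULES = [
    {"name": "allow-workspace", "paths": ["/workspace/**"],
     "operations": ["read", "write", "create"], "decision": "allow"},
    {"name": "approve-delete", "paths": ["/workspace/**"],
     "operations": ["delete"], "decision": "approve"},
    {"name": "deny-secrets",
     "paths": ["**/.env", "**/.env.*", "**/credentials*"],
     "decision": "deny"},
    {"name": "deny-ssh", "paths": ["~/.ssh/**"], "decision": "deny"},
]


def evaluate(path, operation, rules=FILE_RULES, default="deny"):
    """Return (decision, rule_name) for the first matching rule.

    Assumed semantics only; see the lead-in for the list of assumptions.
    """
    path = os.path.expanduser(path)
    for rule in rules:
        ops = rule.get("operations")
        if ops is not None and operation not in ops:
            continue  # rule is scoped to other operations
        for pattern in rule["paths"]:
            if fnmatchcase(path, os.path.expanduser(pattern)):
                return rule["decision"], rule["name"]
    return default, None


if __name__ == "__main__":
    for p, op in [("/workspace/src/main.py", "write"),
                  ("/workspace/cache", "delete"),
                  ("/srv/app/.env", "read")]:
        print(p, op, evaluate(p, op))
```

One thing the sketch makes visible: under first-match semantics, a read of /workspace/.env would hit allow-workspace before deny-secrets. Whether agentsh gives deny rules precedence is not stated here, so confirm the real evaluation order before relying on rule position.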

Starter policy packs

dev-safe.yaml
workspace local

Great default for local work. Workspace access, env/secrets/SSH denied, and a tight network allowlist.

ci-strict.yaml
CI/CD strict

Designed for CI runners. Denies access outside the workspace and restricts network access to registries and required endpoints.

agent-sandbox.yaml
sandbox paranoid

For running unknown code. Default deny, explicit allowlists, approvals, and soft-delete quarantine.

Escape the Retry Loop

Redirect, don’t just deny

A deny often triggers “try harder.” Different flags, different paths, Base64 encoding, creative workarounds. You’ve seen it: dozens of retries, wasted tokens, and the task still doesn’t complete.

The deny spiral:

curl https://example.com → denied
wget https://example.com → denied
python -c "import urllib..." → denied
nc -v example.com 443 → denied
... 10 more attempts, still denied

With redirect:

curl https://example.com → redirected to agentsh-fetch
Operation succeeds, agent continues
You control what actually happens
Full audit trail preserved

Redirect keeps work moving. The agent sees success and proceeds. You decide where writes land, which endpoints get hit, and what command is truly executed—without breaking the flow.

redirect policy examples
command_rules:
  # Route downloads through audited proxy
  - name: redirect-curl
    commands: [curl, wget]
    decision: redirect
    message: "Downloads routed through audit"
    redirect_to:
      command: agentsh-fetch
      args: ["--audit"]

file_rules:
  # Agent tries /tmp? Fine, but it lands in workspace
  - name: redirect-outside-writes
    paths: ["/tmp/**", "/var/**", "/home/**"]
    operations: [write, create]
    decision: redirect
    redirect_to: "/workspace/.scratch"
    message: "Writes redirected to workspace"
    
  # Trying to read secrets? Get a placeholder instead
  - name: redirect-secrets
    paths: ["**/.env", "**/*secret*", "**/*credential*"]
    operations: [read]
    decision: redirect
    redirect_to: "/workspace/.mock-secrets"
    message: "Secret access redirected to mock"
Great for: sandboxing untrusted code, mocking external services, enforcing workspace boundaries, and auditing network access—without derailing the agent.
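
The command redirect above amounts to rewriting the argv before anything runs. A minimal sketch in Python, with one loud assumption: the replacement command receives its configured `args` followed by the original arguments. The real merge order may differ, and `apply_redirect` is a hypothetical helper, not part of agentsh.

```python
import os

# command_rules from the redirect example above, transcribed by hand.
REDIRECT_RULES = [
    {"name": "redirect-curl", "commands": ["curl", "wget"],
     "redirect_to": {"command": "agentsh-fetch", "args": ["--audit"]}},
]


def apply_redirect(argv, rules=REDIRECT_RULES):
    """Return the argv that would actually run (assumed merge order)."""
    program = os.path.basename(argv[0])  # match /usr/bin/curl as curl
    for rule in rules:
        if program in rule["commands"]:
            target = rule["redirect_to"]
            return [target["command"], *target["args"], *argv[1:]]
    return list(argv)  # no rule matched: run unchanged


if __name__ == "__main__":
    print(apply_redirect(["curl", "https://example.com"]))
    print(apply_redirect(["ls", "-la"]))
```

The agent still issues `curl`; what changes is what executes, which is why the retry loop never starts.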

Put guardrails under your agents

Open source. Practical by default. Built for real agent runs—where mistakes are expensive.

Need centralized policy, approvals, kill-switches, and SIEM hooks?

Explore Enterprise