16 Bounded Autonomy and Permission Design
Autonomy is not a slider you turn up; it is a contract you write — one that says exactly what the agent may do, where its tries can land, and how you get back to safety when it drifts.
Start with the execution mode you chose in Chapter 6, then write the permission contract that makes that mode safe. The canonical six stops from Chapter 6 are Manual, Assisted editing, Plan mode, Act mode, Autonomous loop, and Background queued. The mode answers how the agent works with you; the permission contract answers what the harness allows: reads, writes, shell commands, network calls, external tools, prompts for help, and recovery exits.
Those axes compose. Manual work should require explicit human action for mutation. Assisted editing may allow IDE-local diffs but still ask before arbitrary shell commands. Plan mode should be read-only. Act mode can write inside the agreed task boundary and run agreed checks. Autonomous loop and Background queued both need allow-lists, hard denies, budgets, pause artifacts, and fail-closed ambiguity handling. Bounded autonomy is the discipline of choosing those rules before the run starts instead of improvising after the agent drifts.
| Execution mode from Chapter 6 | Typical permission contract |
|---|---|
| Manual | No autonomous writes; the human performs or approves every change. |
| Assisted editing | IDE-local diffs and agreed checks; shell, network, and shared-branch actions ask. |
| Plan mode | Read-only tools plus analysis; mutation denied until the plan is approved. |
| Act mode | Workspace writes and named test/lint commands inside the task boundary; risky paths ask or deny. |
| Autonomous loop | Explicit allow-list, hard denies, turn/token/wall-clock budgets, verifier gates, and structured handoff on ambiguity. |
| Background queued | The Autonomous loop contract plus visible queue status, cancellation, and handoff artifacts for asynchronous review. |
16.1 Bounded Autonomy: Granting Freedom Where It Pays Off
The decision to let an agent run unattended belongs to Chapter 6, and the inner/outer loop boundary itself is established in Chapter 8. What this chapter owns is the corollary: each loop needs a different permission contract because feedback latency changes the acceptable blast radius. The inner loop runs fast under your eye and tolerates broad edit/test latitude. The outer loop runs slow, mutates shared state, and tolerates almost none. Apply one contract to both and you misconfigure both — the inner loop gets interrupted into uselessness, or the outer loop runs unattended over a production database [289]. A practical split: the inner-loop profile permits unrestricted reads, edits, and test runs but hard-denies git push, deploy commands, and migration runners; the outer-loop profile requires explicit approval before any shared-branch mutation. Tools that support per-phase workflow files let you commit this split as configuration rather than a habit you have to remember [290] — a workflow.md in .cline/ for permissive iterative sessions, a release.md for conservative integration runs, both versioned with the repo so a teammate joining mid-cycle inherits the right posture.
Headless invocation is where the contract gets stress-tested. When agents run from CI, cron, or an orchestrator, every interactive escape hatch is gone: the human who would have approved a borderline action isn’t sitting at the terminal [291]. The unattended contract has five legs that must all hold at once: no interactive prompts (the run cannot stall waiting for a human who isn’t there), an explicit allow-list (everything outside it is denied, not asked), hard denies that survive every broad-permission mode (so a misfiring posture flag can’t widen the surface), bounded turns and tokens (so a confused loop terminates on its own budget), and a fail-closed handoff when ambiguity reaches files, data, or external systems (so the run aborts with a structured artifact instead of guessing). The dangerous shortcuts (--dangerously-skip-permissions, --auto, yolo modes) replace this contract with blanket approval. The safer shape for unattended runs is a locked-down, no-prompt posture from the start:
claude -p "Run the failing tests and propose a minimal fix" \
--permission-mode dontAsk \
--allowedTools "Read,Glob,Grep,Bash(npm test),Bash(npm run lint)" \
--disallowedTools "Bash(git push *),Bash(curl *),AskUserQuestion" \
--max-turns 30
Read this against the evaluation order the next section operationalizes. Hooks and permissions.deny[] rules are hard blocks: they fire first and hold in every mode, including bypass, which is why network exfiltration and shared-branch mutation belong in disallowedTools rather than relying on the model to abstain. dontAsk (or any equivalent no-prompt posture) hard-denies anything not on the allow-list, because there is no interactive callback to escalate to. The absence of a prompt is the enforcement, and excluding AskUserQuestion keeps the run from hanging on a question no one will answer. The turn cap plus a token or wall-clock budget enforced by the orchestrator bound a runaway loop before it bankrupts you or rewrites the database. If a specific task genuinely needs bypassPermissions or yolo mode, the only things still standing between the agent and your filesystem are the hard denies and the sandbox confinement covered in Section 16.4; never deploy bypass without both. The fail-closed handoff that completes this contract (emit a structured needs-human-input artifact and exit non-zero rather than guess) is detailed in Section 16.5. If your headless run can run forever, or can stall on a question no one will answer, you don’t have bounded autonomy; you have a debugging session with no debugger attached.
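The orchestrator-side wall-clock budget mentioned above can be sketched in a few lines. This is a minimal illustration, not any tool's built-in feature: the function name `run_bounded` and the exit code 124 (a convention borrowed from GNU `timeout`) are assumptions, and only the direct child process is killed on overrun.

```python
import subprocess
import sys

TIMEOUT_EXIT = 124  # assumption: borrowed from GNU timeout; any distinct code works


def run_bounded(cmd: list[str], wall_clock_s: float) -> int:
    """Run a headless command under a wall-clock cap and return its exit code."""
    try:
        done = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,  # no terminal attached: nothing can wait on a human
            timeout=wall_clock_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising.
        print(f"wall-clock budget of {wall_clock_s}s exceeded", file=sys.stderr)
        return TIMEOUT_EXIT
    return done.returncode
```

A CI step would pass the agent command line as `cmd` and treat the timeout code as a handoff signal rather than retrying. Killing grandchild processes and enforcing a token budget are left out of this sketch.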
Autonomy bounded only by prompt instructions fails. Careful pre-tool hooks and elaborate slash-command suites layered on top of an otherwise unconstrained YOLO session mostly don’t stick: denies don’t work as advertised, automations go unused, and the apparent leverage costs you mental disengagement that quietly accumulates regressions [292]. Structural enforcement — what the rest of this chapter builds out — is what makes the contract real. The same lesson runs through narrow-scope agent design more broadly: scope and structured constraints are stronger levers than model upgrades [293].
16.2 The Permission Contract: Layered Architecture and Programmatic Enforcement
The permission system is where the autonomy contract becomes enforceable. Claude Code is the clearest public reference implementation, and its exact evaluation order should be treated as Claude-specific rather than universal [291]. Other tools expose comparable operator moves but not necessarily the same precedence model. The transferable principle is short: hard denies before broad autonomy, an explicit session posture, and an intercept layer for policy. Chapter 2 established the runtime sequence; this section operationalizes it as a contract you can write and ship.
In Claude Code, the order is load-bearing: hooks fire first and can intercept any tool call programmatically, returning allow / deny / ask, and they fire even in bypassPermissions mode [53]. Deny rules then block specific tools unconditionally and survive every other mode, including bypass. Permission mode sets the session-wide posture. Allow rules pre-approve specific tools. The interactive canUseTool callback handles anything left over.
| Mode | What auto-approves | What still asks or blocks | Common trap |
|---|---|---|---|
| default | Tools in allowedTools | Everything else | Forgetting to commit allowedTools to the repo |
| acceptEdits | Edit, Write, plus mkdir, rm, mv, cp, sed | Arbitrary Bash | rm -rf proceeds without a prompt |
| plan | Nothing (read-only) | All mutation | Rubber-stamping unread plans |
| dontAsk | Tools in allowedTools; everything else hard-denied | No interactive callback | Headless runs that need clarification stall |
| bypassPermissions | Everything that reaches the mode step | Only hooks + deny rules can stop it | allowedTools is ignored as a constraint |
The most operationally dangerous misunderstanding is that allow-lists constrain bypass mode. They don’t. bypassPermissions approves every tool that reaches the mode step; the allow-list only matters when something would otherwise need to ask. If you need a hard block that survives bypass, use disallowedTools or a settings.json permissions.deny[] rule, both of which hold in every mode.
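The evaluation order and the bypass caveat are easier to check against a toy model. The sketch below is a simplification under stated assumptions, not Claude Code's implementation: tool calls are plain strings such as `Bash(git push origin main)`, rules are matched with Python globbing, hooks return only `"allow"`, `"deny"`, or `None`, and the `plan` and `acceptEdits` modes are omitted.

```python
from fnmatch import fnmatchcase
from typing import Callable, Optional

Hook = Callable[[str], Optional[str]]  # "allow", "deny", or None to pass


def rule_matches(rule: str, call: str) -> bool:
    """A bare tool name matches every call to that tool; otherwise glob-match."""
    if "(" not in rule:
        return call.split("(", 1)[0] == rule
    return fnmatchcase(call, rule)


def evaluate(
    call: str,
    hooks: list[Hook],
    deny: list[str],
    mode: str,
    allow: list[str],
    can_use_tool: Optional[Callable[[str], str]] = None,
) -> str:
    for hook in hooks:  # 1. hooks fire first, in every mode
        verdict = hook(call)
        if verdict is not None:
            return verdict
    if any(rule_matches(r, call) for r in deny):  # 2. deny rules survive bypass
        return "deny"
    if mode == "bypassPermissions":  # 3. bypass approves whatever reaches this step
        return "allow"
    if any(rule_matches(r, call) for r in allow):  # 4. allow rules pre-approve
        return "allow"
    if mode == "dontAsk" or can_use_tool is None:  # 5. nobody to ask: fail closed
        return "deny"
    return can_use_tool(call)
```

In this model the allow-list is never consulted under `bypassPermissions`, which is the trap described above: an unlisted `Bash(rm -rf build)` is approved in bypass and denied in `dontAsk`, while a denied `git push` is blocked in both.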
These layered controls show up in two distinct places — in code as SDK session options, and on disk as a committed policy file — and confusing the two is the most common configuration error. The file-format version uses the permissions schema with deny, allow, and ask arrays. A minimal .claude/settings.json for a CI deployment looks like this:
{
  "permissions": {
    "deny": ["Bash(git push *)", "Bash(rm -rf *)", "Bash(curl *)",
             "Write(./.env)", "Write(./.env.*)"],
    "allow": ["Read", "Glob", "Grep", "Bash(npm test)", "Bash(npm run lint)"],
    "ask": ["Write(migrations/*)", "Edit(migrations/*)"]
  }
}
Commit the file and the policy travels with the repo; every contributor and every CI runner gets the same baseline. Settings load from layered sources, and the layering itself is configurable. The non-obvious caveat: managed policy settings (organization-deployed via MDM or equivalent) load regardless of the SDK’s settingSources setting, so a CI job that opts out of all on-disk sources may still be subject to enterprise overrides. OpenCode encodes the same lesson in a six-layer precedence stack (remote organizational, global user, custom env var, project, inline env var, managed admin), where managed admin config wins unconditionally and non-conflicting settings from all layers are merged rather than discarded [294]. The transferable rule across both: never assume your config is sovereign without checking the managed-policy layer.
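Because the committed policy file is the baseline everyone inherits, it is worth linting in CI before it merges. The sketch below assumes the `permissions` schema shown above; the `REQUIRED_DENIES` baseline and the function name `lint_policy` are hypothetical choices for illustration, not part of any tool.

```python
import json

# Hypothetical team baseline: denies every committed policy must contain.
REQUIRED_DENIES = {"Bash(git push *)", "Bash(curl *)"}


def lint_policy(text: str) -> list[str]:
    """Return human-readable problems found in a settings.json permissions block."""
    perms = json.loads(text).get("permissions", {})
    deny = set(perms.get("deny", []))
    allow = set(perms.get("allow", []))
    problems = [f"missing hard deny: {rule}" for rule in sorted(REQUIRED_DENIES - deny)]
    problems += [f"both allowed and denied: {rule}" for rule in sorted(allow & deny)]
    if allow & {"Bash", "Bash(*)"}:
        problems.append("unscoped Bash in allow-list")
    return problems
```

An empty list means the file meets the baseline. The check compares rule strings literally, so it will not notice a broader glob that happens to cover a required deny.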
16.3 Cross-tool comparison: three operator intents
The same three operator intents — hard-deny shared-branch mutation, ask before migration writes, intent-specific blocking via hook or policy — translate across tools, though not always at the same granularity. Claude Code, OpenCode, and Amp all expose path- or command-scoped rules with allow/ask/deny semantics. Gemini CLI’s documented controls are coarser: enterprise system settings allowlist tools (tools.core) and blocklist tools or commands (tools.exclude), with system overrides having final say; --approval-mode default|auto_edit|yolo selects per-edit confirmation, edit auto-approval, or full autonomy; --allowed-tools narrows the surface; --sandbox / GEMINI_SANDBOX adds OS-level confinement (yolo enables sandbox by default) [40]. There is no path-scoped ask rule in Gemini CLI, so the migration intent has to be expressed differently.
| Operator intent | Claude Code (.claude/settings.json) | OpenCode (opencode.json) | Gemini CLI (system settings + flags) | Amp (amp.permissions) |
|---|---|---|---|---|
| Hard-deny shared-branch mutation | "deny": ["Bash(git push *)"] [291] | "bash": { "git push *": "deny" } [295] | "tools": { "exclude": ["ShellTool(git push)"] } in system settings; system overrides win [40] | reject Bash --cmd 'git push *' [34] |
| Ask/escalate on migration writes | "ask": ["Write(migrations/*)", "Edit(migrations/*)"] | "edit": { "migrations/**": "ask" } | No path-scoped ask exists. Either keep edit tools in --approval-mode default (every edit prompts) or tools.exclude write tools entirely for this workflow, then run a separate session for migration changes | ask edit_file --path '**/migrations/**' (rules use allow/reject/ask/delegate) [34] |
| Intent-specific hook/policy blocking | PreToolUse hook returning deny [53] | permission map keyed by tool name with shell globs [295] | tools.exclude plus --sandbox for OS-level confinement; expressive policy lives in the system-settings file rather than a hook layer [40] | Ordered rules; delegate --to <plugin> Bash --cmd '*' routes a class of calls to plugin logic that returns the action [34] |
The decision guide is the durable part, not the literal field spelling in every ecosystem: use a config file for standing policy that ships with the repo, a CLI flag to tighten or loosen for one task, a slash command or in-session toggle for mid-session escalation, and a hook script (or Amp delegate plugin) when you need expressive logic — “allow all tools, but reject git push and ask on anything that writes to migrations/” — that no static list can encode. Where a tool’s controls are coarser than your intent (Gemini CLI on path-scoped ask), split the workflow into separate sessions with different exclusion lists rather than pretending a finer-grained rule exists.
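The quoted policy ("allow all tools, but reject git push and ask on anything that writes to migrations/") is the kind of logic a hook or delegate plugin would carry. A minimal sketch of just the decision function follows. It assumes the call arrives as a tool name plus an input dictionary with `command` and `file_path` keys, and it leaves out the wiring to any particular tool's hook protocol.

```python
import re
from pathlib import PurePosixPath


def decide(tool_name: str, tool_input: dict) -> str:
    """Return "deny", "ask", or "allow" for one intercepted tool call."""
    if tool_name == "Bash":
        # Reject any shell command that contains a git push.
        if re.search(r"\bgit\s+push\b", tool_input.get("command", "")):
            return "deny"
    if tool_name in {"Write", "Edit"}:
        # Escalate writes to any path with a "migrations" directory component.
        path = PurePosixPath(tool_input.get("file_path", ""))
        if "migrations" in path.parts:
            return "ask"
    return "allow"
```

String matching on shell commands is easy to evade (for example `git -C . push`), so a function like this complements a static deny rule and sandbox confinement rather than replacing them.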
Hooks are the most expressive layer of the contract and the one most underused. A PreToolUse hook can inspect the target path before any write executes — not by parsing the model’s output, not by hoping the prompt held, but by intercepting the tool call before execution. The pattern that scales: exit code 2 plus a short stderr message that Claude reads and uses to adjust its plan; verbose explanations are less effective than short, actionable ones, and project-scoped hooks in .claude/ are more maintainable than global settings because they version with the repo [53]. A minimal Bash hook that blocks writes to .env paired with a PostToolUse audit logger:
#!/usr/bin/env bash
# .claude/hooks/block-env-writes.sh: PreToolUse hook, matched on Write|Edit
# The hook payload arrives as JSON on stdin.
TARGET=$(jq -r '.tool_input.file_path // empty')
case "${TARGET##*/}" in   # compare the file name only, so root-level .env.local is caught too
  *.env|.env.*)
    # Exit code 2 blocks the call; the stderr message is fed back to the model.
    echo ".env files are off-limits" >&2
    exit 2 ;;
esac
exit 0   # no verdict: defer to the normal permission flow
#!/usr/bin/env bash
# .claude/hooks/audit-writes.sh: PostToolUse hook, matched on Write|Edit
# session_id is read from the hook payload on stdin rather than from the environment.
INPUT=$(cat)
SESSION=$(jq -r '.session_id' <<<"$INPUT")
TARGET=$(jq -r '.tool_input.file_path' <<<"$INPUT")
echo "$(date -u +%FT%TZ) $SESSION $TARGET" >> .claude/audit.log
Pair the PreToolUse deny with the permissions.deny[] rule above and the same intent is enforced twice (once declaratively in the file, once programmatically in the hook), so a misconfigured settings file or a clever model rewrite can’t sneak past both. The safety guarantee lives in the deterministic host system, not in the probabilistic model. For the broader hook lifecycle model and event taxonomy, see Chapter 10.
16.4 Sandbox Confinement: Where Tool Calls Can Land
Tool permissions and sandbox boundaries are not the same control. Tool permissions govern what the agent may try; the sandbox governs where those tries can land. A Bash(*) call the permission layer waved through can still hit a network the sandbox blocks, a filesystem the sandbox makes read-only, or a directory tree the sandbox doesn’t expose. Skip the second layer and a single misconfigured allow-list — or any of the bypass-mode footguns from the table above — is the only thing standing between your agent and your home directory.
The operator decision is not “use a sandbox” but “which confinement profile fits this loop.” Three profiles cover the cases this chapter cares about, each mapping to one of the autonomy postures established earlier:
| Profile | Filesystem | Network | Tool surface | Fits |
|---|---|---|---|---|
| Read-only analyst | Read-only over the repo | Off | Read, Glob, Grep only | Documentation review, security audit, codebase exploration before plan approval |
| Workspace-write builder | Read-write inside the workspace; read-only outside | LAN-only or off | Above + Edit, Write, Bash(npm test\|pytest\|lint) | Inner-loop implementation under your eye |
| Headless CI worker | Read-write inside an ephemeral checkout; nothing else writable | Egress allow-listed (registries, your CI control plane); no general internet | Above + project-specific build/test, no git push, no deploy | Outer-loop unattended runs |
Four operator decisions turn a profile row into actual confinement, each chosen at the layer your platform enforces (host, container, or harness). Pin the writable root explicitly rather than inheriting cwd, so a Bash step cannot cd into a sibling repo or $HOME; for headless workers, run against an ephemeral checkout the orchestrator destroys at the end of the run. Make everything outside that root read-only (dotfiles, secrets directories, sibling repos, vendored dependency trees) so a misfiring Edit fails before it can corrupt anything. Treat network as a spectrum, not a switch: the analyst gets nothing outbound, the builder gets LAN-only for a local language server or test database, the CI worker gets an explicit allow-list of the registries it actually uses and the control plane it reports to. Align the tool surface with the sandbox: don’t allow Bash(curl *) in a network-off profile or Edit in the read-only analyst row, because permission/sandbox contradictions surface mid-task as confusing tool errors and noisy retries.
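The last of those decisions, aligning the tool surface with the sandbox, can be checked mechanically before launch. The sketch below paraphrases the three profiles from the table. The profile keys, the `NETWORK_COMMANDS` set, and the function name are hypothetical, and the command list is illustrative rather than exhaustive.

```python
import re

# Profiles paraphrase the table above; the keys are hypothetical names.
PROFILES = {
    "read-only-analyst": {"fs_write": False, "network": "off"},
    "workspace-write-builder": {"fs_write": True, "network": "lan"},
    "headless-ci-worker": {"fs_write": True, "network": "allow-list"},
}
WRITE_TOOLS = {"Edit", "Write"}
NETWORK_COMMANDS = {"curl", "wget"}  # illustrative, not exhaustive


def contradictions(profile: str, allowed_tools: list[str]) -> list[str]:
    """List allow-list entries the chosen sandbox profile would defeat."""
    sandbox = PROFILES[profile]
    found = []
    for rule in allowed_tools:
        tool = rule.split("(", 1)[0]
        if tool in WRITE_TOOLS and not sandbox["fs_write"]:
            found.append(f"{rule}: sandbox filesystem is read-only")
        # Pull the first word of a Bash rule, e.g. "curl" from "Bash(curl *)".
        command = re.match(r"Bash\(\s*([\w.-]+)", rule)
        if command and command.group(1) in NETWORK_COMMANDS and sandbox["network"] == "off":
            found.append(f"{rule}: sandbox network is off")
    return found
```

Running a check like this when a profile is committed surfaces the mismatch as a review comment instead of as mid-task tool errors.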
The harness layer is the one this chapter can speak to with citations. Claude Code’s plan mode is a session-wide read-only state enforced by the harness, which the agent cannot bypass by choosing different tools. OpenCode’s built-in Plan agent (read-only) versus Build agent (full tools) is the read-only / workspace-write split expressed as a runtime mode you can pin per task, with max_iterations capping any autonomous build phase [152]. Gemini CLI exposes --sandbox / GEMINI_SANDBOX for OS-level confinement at the harness boundary, with yolo-mode runs sandboxed by default [40]. Harness controls are convenient and observable, but they are the inner ring; when stronger isolation is required, push the writable-root, read-only-outside, and egress allow-list decisions out to the host or container layer your platform team manages, using whatever mechanisms that layer documents.
The combination matters more than any single layer. A workspace-write profile with Bash(*) allowed but outbound network blocked at the egress boundary is the right shape for inner-loop implementation: the agent can edit, build, and test, but it cannot exfiltrate secrets it stumbles across or pull arbitrary code off the public internet. A read-only profile with full network is the right shape for an analyst that needs to read API docs but must not modify the repo. Mixing them up — read-only filesystem with write-everything-network, or workspace-write with full outbound — is how teams discover the hard way that “it can only run tests” turns out to also mean “it can post your .env to a paste site.”
16.5 Pause Points: Where Humans Stay in the Loop
Pause points are the half of the contract that makes the agent stop before a decision crosses your risk boundary. Use three mechanisms deliberately. Plan mode pauses the whole task before mutation. Tool approval pauses a specific action, such as a shell command or write to a risky path. Question tools ask for missing human context when requirements fork.
A compact example shows the difference. In an attended migration session, start in plan mode: the agent may read the repo and propose implementation_plan.md, but Edit, Write, migration runners, and git push are denied. After review, the human releases only the first slice into act mode: approved_plan_id=auth-jwt-v2; allow Edit(src/middleware/auth.ts); allow Bash(npm test -- auth.middleware.test.ts). A proposed schema migration still asks because it sits outside the released slice. If the agent needs to know whether existing sessions must remain valid, it uses AskUserQuestion in the attended session. If there is no registered handler, the same call would hang or error; that is not a safe pause.
The unattended version must fail closed instead of asking the void. Either register a response handler that routes questions to a queue, or remove question tools and require a handoff artifact:
{
  "status": "needs-human-input",
  "question": "Should existing session cookies remain valid after JWT rollout?",
  "safe_state": "No files changed; implementation stopped before migration, commit, push, deploy, or merge."
}
The runner exits non-zero with a distinct code, attaches the artifact to the job or PR, and waits for a human. Retrying the same headless run without new input is a loop bug, not resilience.
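On the runner side, the handoff amounts to writing that artifact and returning a distinct exit code. A minimal sketch, assuming a hypothetical file name `needs-human-input.json` and an arbitrary exit code of 75 (neither comes from any specific tool):

```python
import json
from pathlib import Path

NEEDS_HUMAN_EXIT = 75  # assumption: any code distinct from an ordinary failure works


def hand_off(question: str, safe_state: str, out_dir: Path) -> int:
    """Write the needs-human-input artifact and return the exit code to use."""
    artifact = {
        "status": "needs-human-input",
        "question": question,
        "safe_state": safe_state,
    }
    path = out_dir / "needs-human-input.json"
    path.write_text(json.dumps(artifact, indent=2) + "\n", encoding="utf-8")
    return NEEDS_HUMAN_EXIT
```

The runner would call `sys.exit(hand_off(...))`, and the orchestrator would key on the distinct code to attach the file to the job or PR instead of scheduling a retry.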
A useful pause policy is short enough to audit: ambiguity about requirements becomes a question or handoff artifact; mutation outside the approved plan asks; shared-branch, deployment, credential, and production-data actions deny by default; and any autonomous loop has a retry, token, and wall-clock cap. Broad auto-approval or bypass breaks the gate when it lets writes proceed before the approval record exists.
Delegation adds one extra rule: make least privilege structural, not behavioral. A read-only reviewer role should be unable to edit by construction:
---
name: doc-reviewer
description: Reads markdown and code, reports inconsistencies. Never writes.
tools: [Read, Grep]
permissionMode: dontAsk
---
Review documentation against implementation. Return a report only.
That role has a smaller blast radius than a general subagent told in prose to avoid edits. The detailed role-design question belongs to Chapter 15 and Chapter 4; the permission-design rule here is that the harness should make unsafe actions unavailable.
16.6 Boundary Control: Two Planes for Context and Tool Access
Boundary control runs on two complementary planes: file-content exclusion controls what the agent can see; tool-access governance controls where it can act. You need both because excluding a file from context does not stop an external tool from fetching related data through another route.
The file-content plane keeps secrets, generated artifacts, customer dumps, and regulated paths outside model context with .gitignore, tool-specific ignore files, workspace excludes, and admin-managed context filters. Sourcegraph Cody’s documented filters show the durable shape: administrators declare include-only, exclude-only, or combined rules using RE2 patterns, and developer-maintained excludes such as .gitignore, files.exclude, and search.exclude also matter [26]. The trade-off is operational: safe exclusions that break daily workflow will be loosened within a sprint, so validate both protection and usability.
The tool-access plane governs MCP servers, connectors, shell commands, and external APIs. A file rule can keep .env out of context, but it cannot stop a generic database tool, web fetcher, or wiki connector from moving sensitive information elsewhere. Amazon Q Developer’s MCP governance is a useful reference because it separates coarse controls, such as org-level enablement, from fine controls, such as approved server registries and version-pinned entries [92]. The caveat is just as important: client-side governance can be bypassed by a sophisticated user, so high-assurance setups need server-side authorization, network egress controls, and immutable audit logs.
Use one policy identity for both planes. If a migration-assistant may inspect schema but not customer data, its context rules should exclude billing models, seed dumps, credentials, and .env*, while its tool rules should allow only read-only schema inspection and PR-commenting tools. Do not let the same role reach a writable database, generic web fetch, or Confluence connector. Pin approved tool versions so an upgrade cannot silently expand the surface.
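One way to keep both planes under a single identity is to declare them in one object, so a role cannot gain a tool without its context rules being reviewed alongside. The sketch below is illustrative only: the tool names `schema_inspect` and `pr_comment` and the exclude globs are hypothetical, and it uses simple glob matching rather than the RE2 patterns some products document.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class RolePolicy:
    name: str
    context_excludes: tuple[str, ...]  # file-content plane
    allowed_tools: frozenset[str]      # tool-access plane

    def may_read(self, path: str) -> bool:
        """True when no exclude pattern matches the path."""
        return not any(fnmatchcase(path, pat) for pat in self.context_excludes)

    def may_call(self, tool: str) -> bool:
        """True only for tools explicitly granted to this role."""
        return tool in self.allowed_tools


# Hypothetical encoding of the migration-assistant role described above.
MIGRATION_ASSISTANT = RolePolicy(
    name="migration-assistant",
    context_excludes=(".env*", "*/.env*", "seeds/*", "billing/*"),
    allowed_tools=frozenset({"schema_inspect", "pr_comment"}),
)
```

Client-side checks like these only declare intent. The server-side controls discussed above are still what hold when a client is bypassed.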
| Plane | Mechanism | Failure prevented |
|---|---|---|
| File-content exclusion | .gitignore, ignore files, workspace excludes, admin-managed context filters [26] | Secrets and regulated data flowing into model context |
| Tool-access governance | MCP toggles, version-pinned server allow-lists, per-role tool-name scoping [92] | Agent invoking external systems to read or exfiltrate data the file plane never saw |
| Server-side enforcement | Auth proxies, egress policy, immutable logs, identity-bound API keys | Failure when client-side controls are bypassed, misconfigured, or compromised |
16.7 Recovery as a Design Decision: Building Reversibility Into the Contract
Every bounded-autonomy contract is a guess, and some guesses will be wrong. The permission-design question is not how to diagnose a drifted session after the fact — that’s the territory of Chapter 17. The permission-design question is how to structure the contract so that being wrong is cheap.
Circuit breakers are an autonomy contract decision, not a debugging tactic. Stripe’s Minions cap CI failure cycles at two before handing off to a human [296]. The cap isn’t arbitrary — it’s a recognition that agents repeat failed approaches when their context contains the failure but not why it failed [289]. Without an explicit circuit breaker, autonomous loops cascade: a bad fix introduces a worse bug, which the agent attempts to fix with a more aggressive change, which corrupts more state. Cap the retries before you start the loop, not after. The cap belongs in the same configuration that declares allowedTools and --max-turns — a stop condition is a permission, just expressed in the budget dimension instead of the capability dimension.
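A retry cap declared before the loop starts can be as small as the sketch below. The default of two failures mirrors the cap cited above; the function name and the `attempt_fix` callback, which returns `True` when checks pass, are hypothetical.

```python
from typing import Callable


def run_with_breaker(attempt_fix: Callable[[int], bool], max_failures: int = 2) -> str:
    """Try fixes until one passes or the failure cap trips; never loop unbounded."""
    failures = 0
    while failures < max_failures:
        if attempt_fix(failures):  # receives the number of failures so far
            return "passed"
        failures += 1
    return "handoff"  # cap reached: stop and give the run to a human
```

Because the cap is an argument rather than a habit, it can live in the same configuration as the allow-list and turn budget.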
Commit boundaries are the structural recovery mechanism. The Ralph pattern — an autonomous loop where each iteration starts a fresh context window with a small task pulled from a PRD and a progress file committed to git — formalizes this discipline [297]. The point for permission design is not the Ralph loop itself but the principle: putting commit boundaries between work units means any iteration can be reverted, replayed, or skipped without unwinding the whole session. Agents should commit on every successful unit, not at the end of a long run, because the cost of revert grows with the diff size. Smaller PRD items also produce higher-quality output in autonomous loops because each iteration runs on a fresher, less-filled context window [297], and deliberate context engineering throughout an agent’s trajectory matters more than a static one-time setup [298] — so capping unit size is itself a recovery decision.
The hardest recovery is cognitive, not technical. Long autonomous sessions enable mental disengagement, and the engineer who stopped thinking critically halfway through won’t notice the regressions accumulating in the commits [292]. The session length you can sustain is bounded by what your review capacity can absorb in one sitting, not by how long the agent can keep generating — verification, not generation, is the actual throughput constraint [289]. Skip this and the failure mode is consistent: the code looks plausible, the tests pass, and three weeks later the bug surfaces in production with a commit history that says “agent did it” and no human who can explain why.
The reversible-autonomy checklist that ties the chapter together is short. A bounded-autonomy contract is ready to ship when all four are present:
- Circuit breaker. Maximum retry count before the loop hands off to a human, declared up front [296].
- Commit boundary. Every successful unit of work is committed before the next unit starts, so revert size never exceeds one iteration.
- Fail-closed handoff. When ambiguity reaches files, data, schema, or external systems, the agent emits a structured needs-human-input result and exits non-zero rather than guessing, and stops before commit, push, migration, or deploy if work is already in flight.
- Budget caps on all three axes. Turns (--max-turns), tokens, and wall-clock time, each set high enough for the task and low enough that a confused loop terminates before it touches anything external.
Get those four right and the framing stops being “how do I prevent failure” — failure is going to happen — and starts being “how do I make failure cheap to detect and cheap to reverse.” When the contract still fails despite all four, Chapter 17 covers the in-session repair moves. Get the contract right and you can run agents harder than seems prudent. Get it wrong and the only autonomy left to you is the kind where you’re watching every keystroke — exactly the leverage you adopted the agent to escape.
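The four-item checklist lends itself to a pre-launch gate. The sketch below is one hypothetical shape for it: the `RunContract` fields are illustrative names, not any tool's schema, and a launcher would refuse to start while the returned list is non-empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunContract:
    max_retries: Optional[int] = None       # circuit breaker
    commit_per_unit: bool = False           # commit boundary
    handoff_artifact: Optional[str] = None  # fail-closed handoff path
    max_turns: Optional[int] = None         # budget caps, all three axes
    max_tokens: Optional[int] = None
    max_wall_clock_s: Optional[int] = None


def missing_elements(contract: RunContract) -> list[str]:
    """Name every reversibility element the contract has not declared."""
    missing = []
    if not contract.max_retries:
        missing.append("circuit breaker")
    if not contract.commit_per_unit:
        missing.append("commit boundary")
    if not contract.handoff_artifact:
        missing.append("fail-closed handoff")
    caps = (contract.max_turns, contract.max_tokens, contract.max_wall_clock_s)
    if not all(caps):
        missing.append("budget caps")
    return missing
```

The gate only proves that each element was declared. Whether the values are sensible for the task is still a review decision.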
16.8 Takeaways
- Write separate permission profiles for inner-loop and outer-loop work: the inner-loop profile permits reads, edits, and test runs but hard-denies git push, deploy commands, and migration runners; the outer-loop profile requires explicit approval before any shared-branch mutation.
- Put irreversible risks like network exfiltration and shared-branch mutation behind hard-deny controls that survive broad-autonomy modes; do not rely on allow-lists as your final safety boundary.
- Commit your permissions policy file (e.g., .claude/settings.json) to the repository so every contributor and every CI runner inherits the same baseline without relying on local configuration.
- For sensitive paths, enforce the block twice: a static deny rule in the repo policy and a pre-tool hook, so a config mistake or model rewrite cannot bypass both.
- Match each permission profile to a confinement profile: pin the writable root, make everything else read-only, scope network egress to the task, and align the tool surface with the sandbox so permissions are not your only boundary.
- Design unattended agent runs to emit a structured needs-human-input artifact and exit non-zero when the agent hits ambiguity, rather than stalling on a question no one will answer or guessing; retrying the same run without new input is a loop bug, not resilience.
- Treat boundary control as two planes: file-content exclusion controls what enters model context, while tool-access governance controls which external tools, connectors, and APIs can fetch or mutate related data.
- Before launching any autonomous run, verify all four reversibility elements are present: a retry circuit breaker declared up front, commit boundaries between work units so any iteration is independently revertable, a fail-closed handoff artifact when ambiguity reaches files or external systems, and budget caps on turns, tokens, and wall-clock time.