5  Context Engineering and Session Control

Context is what the agent sees. Session control is what survives between sessions, what gets refreshed, and when. Both are operator decisions, not features.

5.1 Context Is an Operational Control System

The skill that matters most when working with coding agents is not writing clever prompts. It is assembling the right information — instructions, code, documentation, tool definitions, conversation history — into a coherent package that gives the agent what it needs to succeed. This book uses four context terms deliberately. Context window is the model and harness limit: how much text can be present at once. Context budget is your allocation decision inside that limit: which files, rules, tools, and history deserve space. Context engineering is the practice of managing that allocation over time. Context rot is stale, partial, or misleading working context that still looks plausible to the agent. That is why the field increasingly talks about context engineering instead of prompt engineering: the hard part is not wording a clever request once, but feeding the model the right working set over time [83], [84].

But filling the window is only half the job. The other half is what the agent carries between sessions, what gets pre-loaded before the first turn, what activates conditionally, and when to throw the whole thing away and start over. A coding agent has vast general knowledge, zero memory of your specific codebase, and no working memory left after a session ends. The practitioner who treats every session as a fresh slate spends an hour every morning re-explaining the same project conventions. The practitioner who never resets pushes through degradation until the agent forgets files it was just reading. Neither extreme is the answer. The job is to manage what enters the window, what persists across sessions, when to compact, when to restart, when to fork, and how to prevent context rot. That is an operational control system, and it is yours to operate.

A useful mental model: three different kinds of content compete for the same finite window, and they have different ownership and different lifetimes. Standing context is what the harness pre-loads before the first turn — instruction files like CLAUDE.md or AGENTS.md, project conventions, agent-generated memory entries marked always-on. Conditionally-injected context is what activates only when the work matches a trigger — a file glob, an explicit @-mention, the model’s own judgment. Per-turn working context is the live conversation, the files you read this turn, the tool outputs you just received. Each layer has its own controls. Most of what goes wrong with context comes from confusing them.

Call the failure mode context rot: the agent’s working set still looks full, but the useful state inside it has gone stale, partial, or misleading. Old tool output outweighs the current task, a compacted summary drops the decision that mattered, memory carries a fact from before a migration, or a broad retrieval result makes the wrong module feel relevant. Context rot is not the same as missing context. Missing context means the agent never saw the right information; context rot means it saw information that now points it in the wrong direction. Chapter 17 uses this term as a diagnostic category, but the operating controls live here: prune, checkpoint, reset, fork, and keep durable facts outside the chat stream.

Named failure mode: context rot. Stale, partial, or misleading working context that still looks plausible to the agent. The fix is not “add more context”; it is to remove stale context, externalize durable state, checkpoint, reset, or fork before the polluted working set drives more decisions.

The shift this chapter asks of you: stop thinking of context as something the harness manages on your behalf. Start treating it like a budget you administer. Decide what enters, what stays out, what survives the next reset, and what does not.

5.2 Context Window Budget Management

Context window budget management starts with one fact: every coding agent operates within a finite context window. Everything the agent knows during a session lives there: system prompt, instruction files, conversation so far, every file read, every tool output, every tool definition. When this window fills, something has to give.

Advertised context windows keep growing, but effective capacity has not kept pace. A safer practitioner heuristic is to treat 60–80% utilization as your effective ceiling, not the marketing ceiling. More tokens do not automatically mean more useful attention; every extra file competes with the reasoning budget for the next step [85], [86].

Knowing where your tokens go is the prerequisite to managing them. Claude Code exposes this directly: /context shows a per-component breakdown — system prompt, custom instructions, MCP tool definitions, conversation history, files in scope. A single MCP server can add tens of thousands of tokens of tool schemas before you type a word. Auto-generated AGENTS.md files cost you on every session forever. Run /context before you assume you are short on room because the conversation got long; usually a meaningful fraction of the window was committed before the conversation started.

The practical discipline: stay under 60–80% utilization. Beyond that you are in the degradation zone, and “more headroom” is the highest-leverage thing you can buy. Audit standing context first — it is the cheapest to trim because it costs you on every session. Then look at what you have pinned this turn. Only after both of those should you consider compaction or restart.

The cost math is blunt. A 200k-token request full of irrelevant vendored code costs the same input-token spend as a 200k-token request full of the three files you actually needed. The model does not discount noise for you. Three trigger signals justify intervention immediately: the window climbs past roughly 70–80% capacity, the session cost rises while the task is no longer getting clearer, or the agent starts repeating earlier decisions instead of extending them. Numeric caps are the harness-level version of the same discipline: tools such as Aider expose budget_tokens in configuration, and adjacent ecosystems expose env-var caps like GOOSE_MAX_TOKENS, so the budget ceiling can live in version-controlled settings instead of only in your head [87], [85].

The operator toolkit is more concrete than “watch the window”:

  • Active prune: remove stale files from Aider’s /add set, drop stale pins, or switch to a narrower symbol/line-range target.
  • Hard caps: use budget_tokens or similar env-var ceilings so the session cannot keep accreting context invisibly.
  • No-context mode: when the question is generic, strip repo chips entirely instead of paying for irrelevant recall.
  • Compaction as fallback: /compact preserves the harness’s summary of the thread, not your exact reasoning path, so use it only after you have already pruned what you can.
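The thresholds and ordering above can be condensed into a small decision helper. This is an illustrative sketch, not any harness's real API: the function name, the signal fields, and the exact cutoffs (60% comfortable, 80% degradation zone) are taken from the heuristics in this section.

```python
# Hypothetical helper encoding this section's trigger signals. The 0.60/0.80
# cutoffs and the prune-before-compact ordering come from the text; nothing
# here is a real harness surface.

def recommend_context_action(tokens_used: int, window: int,
                             repeating_decisions: bool = False) -> str:
    """Map utilization and coherence signals to the cheapest effective move."""
    utilization = tokens_used / window
    if repeating_decisions:
        # Re-proposing earlier decisions is a coherence signal, not a space
        # signal: externalize and reset rather than squeezing the window.
        return "checkpoint-and-reset"
    if utilization < 0.60:
        return "continue"            # comfortably under the effective ceiling
    if utilization < 0.80:
        return "prune"               # free and reversible: drop stale pins first
    return "compact-or-restart"      # pruning headroom is gone

# A 200k window with 150k tokens in flight sits at 75% utilization:
print(recommend_context_action(150_000, 200_000))  # prune
```

The point of writing it down is the ordering: pruning is always checked before compaction, and coherence failures short-circuit the size logic entirely.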

5.3 Precise Context Targeting

Precise context targeting means choosing the smallest slice of context that still lets the agent act correctly. You do not have to choose between “include the whole file” and “include nothing.” Modern agents expose a precision spectrum, and the operator move is to drop one level down before concluding the agent is broken.

At the top: a directory or whole repository. Useful for genuine exploration when you do not yet know where the relevant code lives. At the next level: a whole file via @filename or its equivalent. Useful when you genuinely need the surrounding context. Below that: a line range — Cody lets you write @src/auth/session.ts:45-72 and inject exactly the function in scope without the rest of the file [88]. Below that: a symbol — @#SessionManager injects the class definition by name [88]. At the bottom: nothing. No context chips at all. The right move when the question has nothing to do with your codebase — a language feature, a standard library API, a question about a framework you have not adopted yet [89].

For long sessions, Aider’s /add and /drop make context management explicit between turns. The agent only sees what is currently in the /add list. When a file becomes irrelevant, drop it. When you need to look at a different module, add it. The model is operating on a curated set you maintain by hand. This sounds tedious; in practice it is what separates productive long sessions from ones that drown in their own context [90], [91].

The practical rule: if the agent keeps referencing the wrong function in a large file, do not reload the whole file with better framing — pin the line range. If it keeps confusing two similarly named classes, inject the symbol definition. If it keeps producing project-specific advice for a generic question, strip the chips. Each level down is a quality improvement and a cost reduction simultaneously.

One compact vignette makes the spectrum concrete. Suppose billing.ts contains both invoice rendering and payment-retry logic. Whole-file context causes the agent to keep proposing invoice-template edits when you are only debugging retry backoff. The next move is not to “prompt better.” It is to pin the retry function by line range or symbol so the working set excludes the invoice code. And if the task becomes “how does this SDK normally implement exponential backoff?”, go one level lower again: drop the project chips and ask the generic API question with no repo context. The operator skill is moving down the precision ladder before the session drifts.

The tradeoff to understand: a pin raises relative priority; it does not displace auto-retrieved context outright. A @filepath:1-10 pin tells the agent “this slice matters most”; it does not tell the index “stop pulling related files.” If you need only what you pinned, you also need a no-context mode or a narrow exclusion policy. And too many @-mentions exhaust the window — every tool eventually surfaces a “file too large” or “too many references” error. Accumulating mentions is the same failure mode as loading irrelevant files, just more deliberate.

Two targeting failures are worth memorizing. First, stale indexing: @#symbol only works if the tool’s index has processed the file you are naming, so a missing symbol is sometimes an indexing problem, not a retrieval problem [88]. Second, policy beats targeting: if a file is excluded by .clineignore, files.exclude, or cody.contextFilters, an @-mention will not rescue it. Precision targeting sits below boundary policy, not above it.

5.4 Context Exclusion and Boundary Control

Boundary control is the contract that says what may enter context and what may not. .gitignore helps, but it is not enough on its own: workspace settings, agent-specific ignore files, and enterprise retrieval filters can all widen or narrow the boundary independently [88], [26].

Use the boundary recipe in this order:

  1. Name the paths that must never reach the model: secrets, generated artifacts, vendored code, large datasets.
  2. Encode them at the right layer: files.exclude and search.exclude for the workspace and index, .clineignore for the agent session, cody.contextFilters for enterprise retrieval policy.
  3. Choose the policy mode deliberately: include-only for a narrow allowlist, exclude-only for a broad default with a denylist, combined when an allowlist still needs local carve-outs.
  4. Validate the boundary: keep workspace settings aligned with .gitignore, check RE2 syntax in cody.contextFilters, and remember that Cody’s exclude rules can disable Prompts [26].
  5. Separate file exclusion from tool governance. If the risk is outbound access, use a tool allow-list or permission gate rather than pretending file filters solve it [92].

One worked example makes the modes concrete. Use include-only for a narrow allowlist such as ^github\\.com\\/mycompany\\/payments-.*. Use exclude-only when the default is broad but paths like vendor/ or dist/ should never enter retrieval. Use combined when src/ is generally allowed but src/generated/ still has to stay out. In RE2, src/generated.* is broader than src/generated/.*, so a small pattern mistake can silently widen or narrow the boundary [26].
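A hedged configuration sketch makes the two layers visible side by side. files.exclude and search.exclude are standard VS Code workspace settings; the cody.contextFilters block follows the repoNamePattern shape Sourcegraph documents for site configuration, but the repo names here are hypothetical and the exact schema can vary by version — verify against your deployment's docs before copying.

```jsonc
// .vscode/settings.json — workspace/index layer: keep noise out of the index.
{
  "files.exclude":  { "**/dist": true, "**/vendor": true },
  "search.exclude": { "**/dist": true, "**/vendor": true }
}

// Sourcegraph site configuration — enterprise retrieval layer, combined mode:
// allow the payments repos, then carve a generated-code repo back out.
// (Repo names are illustrative; patterns are RE2.)
{
  "cody.contextFilters": {
    "include": [ { "repoNamePattern": "^github\\.com/mycompany/payments-.*" } ],
    "exclude": [ { "repoNamePattern": ".*payments-generated$" } ]
  }
}
```

Note that the two snippets live in different files with different owners: the first is a workspace setting any developer can edit, the second is admin-managed policy.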

Plane | Control | Operator question | Common failure
Workspace/index | .gitignore, files.exclude, search.exclude, cody.contextFilters | Can this path be indexed or retrieved at all? | Drift between .gitignore and workspace settings; malformed RE2
Agent session | .clineignore | Can this task ever read this path? | Assuming Git exclusions already cover the agent
Tool access | org allow/deny lists, MCP permissions, sandbox file controls | May the agent leave the repo and touch external systems? | Broad default allow-lists that make exfiltration possible

The mechanical rule is simple: exclusion wins over inclusion. If a path is excluded by .clineignore, files.exclude, or cody.contextFilters, an @-mention does not rescue it. Change the policy first; do not work around it.

One compact example covers both planes. If secrets/** and vendor/** should never enter the model, exclude them in the file/context layer. If the same task also has access to a deployment-log MCP server, govern that separately with an org or account allow-list. The first control protects the model’s working set. The second limits what the agent can do outside the repo [93], [94]. Cody’s retrieval stack makes the performance benefit visible too: aligned exclusions speed file lookup because irrelevant material never enters the index in the first place [88].

5.5 Persistent Session Memory and the Per-Turn Working Set

Persistent context is a routing problem: decide what survives the session, where it lives, and how it activates. The important distinction is between human-authored instruction files, agent-authored memory, reusable procedures, and this turn’s working set [43], [95].

Use this routing rule before you store anything:

  1. Team-owned invariant: put it in a repo instruction file such as AGENTS.md or a rules file.
  2. Temporary project fact that should persist across sessions: put it in workspace memory.
  3. Cross-project personal default: put it in user memory only if it is genuinely generic.
  4. Repeatable procedure: make it a workflow or skill, not a free-text memory entry.
  5. Turn-local evidence: keep it in the session or in a handoff note; do not promote it to persistent memory unless it must survive.

Store | Best use | Activation | Main failure
Repo instruction file | team-owned conventions and invariants | always_on or fileMatch | every session pays for text that should have been conditional
Workspace memory | project facts that may change but should survive restarts | loads with the workspace | stale project facts survive migrations
User memory | genuinely generic personal defaults | loads across workspaces | cross-project bleed
Workflow or skill | reusable procedures | manual, fileMatch, or model_decision | expensive procedure fires implicitly or too late

The activation contract is part of the routing decision. always_on is for conventions you are willing to pay for every session. fileMatch is the default for project-specific guidance because it is scoped and reproducible. model_decision is useful, but it is harder to debug because the trigger is not deterministic. Manual invocation is the right choice for expensive or risky procedures.

A few examples make the routing concrete. “Use pnpm in this repo” belongs in a repo instruction file because the team owns it. “legacy-auth is still live until Friday” belongs in workspace memory because it is temporary project state. “I prefer terse diffs” is user memory only if you want that preference everywhere. “Run the billing review checklist before edits” is a workflow or skill. If the same fact lands in all four stores, the agent is not well-instructed; it is overdetermined.
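The four destinations can be sketched as a layout. The file names follow conventions named in this chapter; the memory stores are harness-managed and their on-disk locations vary by tool, so treat this as a map of ownership, not literal paths:

```
AGENTS.md                          # repo instruction file: team-owned invariant
    "Use pnpm, never npm, in this repo."

workspace memory (harness-managed, scoped to this project)
    "legacy-auth is still live until Friday."    # temporary project fact

user memory (harness-managed, loads across all workspaces)
    "Prefer terse diffs."                        # genuinely generic default

skills/billing-review/             # workflow or skill: manual activation
    "Checklist to run before any billing-code edit."
```

Reading the layout top to bottom is also the debugging order from Section 5.6: team-owned files first, workspace memory second, user memory last.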

Standing context is not free. ETH Zurich found that auto-generated AGENTS.md files increased token costs materially while not reliably improving outcomes, so reserve always-on text for true invariants and push everything else toward fileMatch or manual activation [38]. The mechanics of writing good rule files belong in Chapter 3; the point here is where each kind of durable context belongs.

5.6 Memory and the Staleness Problem

Persistent memory turns into stale memory unless someone maintains it. Once an agent-authored entry exists, it keeps steering later sessions until you delete or rewrite it. That is why a renamed API or abandoned library can look like model confusion when the real problem is old memory still being treated as truth [95], [43].

Use a short maintenance routine:

  • Before a new project phase, scan workspace memory for facts that may have expired.
  • After a migration or architecture change, delete or rewrite any memory that encodes the old shape.
  • Before sharing workspace memory with teammates, review it like an unreviewed config change.
  • If behavior goes wrong, debug in order: repo instruction files first, workspace memory second, user memory last.

The canonical stale-memory bug is simple: the repo moved from npm to pnpm, but workspace memory still says “use npm install.” Every fresh session follows the stale instruction faithfully until you remove it. The fix is memory hygiene, not a better prompt [46].

The other persistent failure is cross-project bleed. User memory loads everywhere, so a project-specific preference stored there will contaminate unrelated repos. Keep cross-project memory genuinely generic. Everything else belongs at workspace scope or in human-authored rules.

One debugging vignette ties the stores together. If the agent keeps suggesting npm install in this repo while also defaulting to Redux in a new Vue project, check repo rules first: if AGENTS.md still says npm, the bug is team-owned and should be fixed there. If repo rules are correct, inspect workspace memory next: a stale “use npm” entry belongs there after a migration. If the Vue project still picks Redux after workspace memory is clean, inspect user memory last: that is where cross-project bleed lives. The order matters because each store has a different owner and a different blast radius.

5.7 Session Reset and Context Checkpointing

Checkpointing is the precondition for reset and resume. Before you fork or reset, externalize three things in a short handoff note: current state, open questions, and key decisions. That note is the durable checkpoint. The session itself is not [24], [96].

Then apply the lifecycle contract:

  • Prune when the task is still valid but the working set is crowded.
  • Fork before a risky branch so the parent session stays clean.
  • Reset when assumptions are rotten or the task boundary changed.
  • Resume only when the work is still valid, the session_id exists, and the cwd or workspace identity still matches.

Move | Use when | Carry forward | Common failure
Prune | token pressure, stale pins, irrelevant files in scope | only the still-relevant context | trying /compact before dropping obvious noise
Fork | speculative migration, uncertain refactor, alternate design | checkpoint note plus clean parent session | contaminating the parent by experimenting in place
Reset | repeated loops, ignored instructions, changed task boundary | handoff note only | restarting without externalizing the decisions that still matter
Resume | crash, pause, meeting interruption | saved session_id plus matching cwd/workspace | resuming against the wrong repo state and trusting the result
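The checkpoint note that prune, fork, and reset all depend on has a fixed three-part shape: current state, open questions, key decisions with rationale. A minimal sketch of that contract, with an illustrative function name and format (not any harness's feature):

```python
# Hypothetical helper: renders the three-part handoff note this section
# describes. The structure is the chapter's contract; the code is a sketch.

def render_handoff(title: str, state: list[str],
                   open_questions: list[str], decisions: list[str]) -> str:
    """Externalize the durable checkpoint before a fork or reset."""
    lines = [f"{title} handoff"]
    lines += [f"- {item}" for item in state]          # current state
    lines += [f"- Open: {q}" for q in open_questions] # unresolved threads
    lines += [f"- Decided: {d}" for d in decisions]   # decisions with rationale
    return "\n".join(lines)

note = render_handoff(
    "Auth migration",  # illustrative task, not from the chapter's examples
    state=["sessions table renamed; integration tests green"],
    open_questions=["keep the legacy token endpoint past the cutover?"],
    decisions=["JWTs stay server-signed (security review)"],
)
print(note)
```

Whether you write the note by hand or template it like this, the invariant is the same: the note, not the session transcript, is the durable checkpoint.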

5.8 A Compression Playbook

Compression is not one move; it is four, and they compress different things. Pick the one that preserves what you actually need and discards what you can afford to lose.

Technique | What it preserves | What it discards | When to reach for it | Failure mode it prevents
Active prune (operator) | the files, pins, and threads still load-bearing for this turn | stale /add entries, irrelevant chips, exhausted @-mentions | first response to token pressure, before any summarization | paying for noise the model has to ignore
Harness compaction (/compact) | the harness’s summary of the thread so the session can keep going | exact reasoning steps, tool-output detail, rejected paths and the why behind them | thread is still valuable but the window is full and pruning is exhausted | hitting a hard window limit mid-task with nothing externalized
Operator handoff note / checkpoint | current state, open questions, key decisions with rationale, in your words | live conversation, tool transcripts, working-set ordering | before fork, reset, or any restart you control; before a known interruption | losing the why behind decisions when the thread dies or rots
Promote to durable memory/rules | invariants the next session should also have: conventions, project facts, procedures | turn-local evidence, one-off debugging state, anything that will expire | once a fact has survived two or three sessions and is still true | re-deriving the same convention every morning, or letting it live only in chat

The decision order matches the cost order. Prune first because it is free and reversible. Compact second because it keeps the thread alive but costs you fidelity. Checkpoint third because it costs two minutes of writing and survives any reset. Promote last, and only deliberately — durable memory is the most expensive store to get wrong, because stale entries silently steer every future session (see Section 5.6).

Three traps to avoid. Reaching for /compact before pruning bakes the noise into the summary. Treating compaction as a checkpoint loses the rationale behind decisions, which is exactly the part that degrades fastest. Promoting turn-local facts into workspace or user memory turns today’s debugging state into tomorrow’s stale-memory bug.

The tool surfaces line up with the same contract. Aider’s /drop and token caps are prune controls [91], [87]. OpenCode’s --fork is an explicit branch surface [97]. Claude Code /resume, OpenCode --continue, and SDK session resume all depend on a valid saved session and a matching workspace [24]. Cline’s checkpoint model is useful because it separates conversation state from file state, which gives you a deliberate restore point before you reach for /newtask [96], [98].

/compact is not a fifth lifecycle move. It is the harness fallback when the thread still has value but the window is full. Use it after pruning, not instead of pruning.

5.9 Stateful vs. Stateless Session Modes

Choose session mode before you build the workflow. Stateless is the default when history would only contaminate the answer. Stateful is justified only when continuity is an asset [24], [23].

The design choice collapses into a session-lifetime matrix. Five shapes cover almost everything; the rest of the section is just the trigger signals and the failure modes for each.

Lifetime shape | Representative surface | Lives for | Trigger signal | Silent failure
Stateless one-shot call | SDK call with no session, CI step, nightly review, scheduled lint pass | one request | “fresh attention is worth more than history” | someone assumes the run can be continued later when nothing was saved
In-process session client | interactive Claude Code or OpenCode session, long-running SDK client holding the thread | one process | “the thread is hot and the process is still alive” | silent context bloat as the in-memory thread grows past the effective ceiling
Captured session_id resume | Claude Code /resume, OpenCode --continue, SDK session resume | across processes, until invalidated | “the work is still valid and I have the session_id” | cwd/workspace mismatch: session restores cleanly against the wrong repo state and the operator trusts the result
Forked branch | OpenCode --fork, Cline new task from checkpoint | a child session, parent untouched | “I need a risky alternative without contaminating the parent” | branch work merged back implicitly, polluting the parent
Explicit statelessness | SDK persistSession: false, deliberately non-resumable jobs | one request, by contract | “resumability is not a requirement and I want that documented” | future readers assume the run is resumable because the surface looks like a session

The matrix is also the operator playbook. A stateless one-shot call is the right default for nightly CI and bounded research questions; preserving state on those runs only invites stale history into clean answers. An in-process session client is the right shape for an interactive coding session — the live thread is the working memory — but only inside one process, and only as long as you watch the window. Captured session_id resume is the right shape for interrupted investigations and migrations, with one hard precondition: you must come back to the same cwd and workspace identity you left from. Forked branches are the right shape for any experiment that might fail noisily; fork before the first destructive step, not after. Explicit statelessness is the right shape when the job is deliberately ephemeral; saying so with a surface like persistSession: false is how you signal that to the next operator.

A short vignette makes the build-then-verify split concrete. You spend an hour in an in-process session client implementing a new rate-limiter: the thread accumulates the design choices you rejected, the edge cases you uncovered, the partial refactor of the middleware you had to do along the way. That continuity is load-bearing — restarting mid-build would cost you the rationale, and this is exactly the case the in-process row of the matrix is for. But once the implementation lands, asking the same session to write the test suite or review its own diff is the wrong move: it will rationalize its own choices, skip the cases it already convinced itself were unreachable, and inherit every assumption baked into the thread. The fix is to switch surfaces — open a fresh stateless one-shot call against the diff alone, with no carryover, and have it generate tests or a code review from scratch. Context isolation is what makes the second pass an actual independent check rather than a continuation of the first.

The decision rule is explicit: use stateful continuity while you are still building and history is an asset; switch to a fresh stateless session the moment you need independent scrutiny rather than more continuity. Review and test generation are the canonical cases, because both depend on context isolation for their integrity — a reviewer who shares the author’s working memory is not really a reviewer. The captured session_id is fine to keep around for resuming the build later; it just should not be the session that grades the build.
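The build-then-verify split can be shown with toy stand-ins. Nothing here is a real SDK: Session and run_review are deliberately minimal fakes whose only purpose is to make visible what each call can see.

```python
# Pedagogical sketch of context isolation. "Session" and "run_review" are toy
# stand-ins, not a real agent SDK; the point is what crosses the boundary.

class Session:
    """Toy stateful session: every turn appends to a shared history."""
    def __init__(self) -> None:
        self.history: list[str] = []

    def turn(self, msg: str) -> None:
        self.history.append(msg)

def run_review(diff: str, context: list[str]) -> list[str]:
    """Toy stateless call: it sees only what is passed in explicitly."""
    return context + [f"review: {diff}"]

build = Session()
build.turn("design: token-bucket rate limiter")
build.turn("assumption: the burst path is unreachable")  # baked-in bias

# Wrong: reviewing inside the build session inherits its assumptions.
biased = run_review("rate_limiter.diff", context=build.history)

# Right: a fresh stateless call against the diff alone, no carryover.
independent = run_review("rate_limiter.diff", context=[])
```

The biased reviewer receives the builder's unverified assumption as context; the independent one receives only the diff, which is the isolation the decision rule above demands.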

The cwd/workspace mismatch deserves the spotlight because it is the canonical silent restore failure. Resume succeeds. The transcript looks right. The agent picks up where it left off. But the repo is on a different branch, or the working tree has uncommitted changes that contradict the resumed plan, or the session_id was captured in a sibling worktree. There is no error — just confidently wrong work. The defensive move is mechanical: if you cannot guarantee the same cwd and workspace identity later, do not rely on resume. Start fresh with a handoff note instead [24], [23].
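The defensive move can be made mechanical with a pre-resume identity check. The dataclass and field names below are illustrative (real harnesses expose cwd and session identity differently), and in practice you would populate the fields from git rev-parse --abbrev-ref HEAD and git status, but the comparison itself is the whole idea:

```python
# Hypothetical guard implementing "same cwd, same workspace identity, clean
# tree" before trusting a resumed session. Field names are illustrative.

from dataclasses import dataclass

@dataclass(frozen=True)
class WorkspaceState:
    cwd: str
    branch: str
    dirty: bool  # uncommitted changes in the working tree

def safe_to_resume(saved: WorkspaceState, current: WorkspaceState) -> bool:
    """Resume only when identity matches and the tree cannot contradict the plan."""
    return (saved.cwd == current.cwd
            and saved.branch == current.branch
            and not current.dirty)

saved = WorkspaceState("/repo/app", "feature/rate-limit", dirty=False)
# Branch moved since the session_id was captured: refuse and start fresh.
print(safe_to_resume(saved, WorkspaceState("/repo/app", "main", dirty=False)))  # False
```

When the check fails, the fallback is exactly the one the text prescribes: do not resume; start a fresh session from the handoff note.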

The other operator moves follow from the same matrix. If the job is a scheduled review or CI pass, do not preserve state just because the tool can. If the experiment may fail noisily, fork before the first destructive step. If the workflow is explicitly ephemeral, say so with surfaces like persistSession: false instead of assuming future readers will infer it [24], [23].

Cline’s task guidance points to the same design rule from the UI side: one task, one goal, and a new task when the goal changes [98]. For the runtime mechanics behind these surfaces, see Chapter 2.

5.10 Context Rot and the Case for Fresh Sessions

In a long session, context rot surfaces as gradual degradation of agent quality. This is not a bug in one tool. It is the structural consequence of long attention chains: once conversation history becomes the main thing the model is reasoning over, quality degrades even if the repo state itself has not changed [99], [100].

The symptoms are distinctive. The agent forgets files it was just reading, re-proposes rejected paths, ignores rules, or declares the work done when it is not. Most insidiously, reasoning behind decisions degrades faster than the decisions themselves: the agent follows established patterns but loses the rationale [101], [100].

The most reliable response is often the hardest one to accept: start a new session. That only works if your decisions and progress survive the boundary. The principle is context anchoring — externalizing anything the agent needs to remember into files that persist outside the conversation.

A practical rule: once the session starts repeating itself, re-proposing rejected paths, or carrying more open threads than you can summarize in a short handoff note, start fresh. Small tasks with clean context routinely outperform larger tasks carried forward on accumulated history, because the agent can reason locally instead of fighting stale assumptions [99], [100].

When you do restart, a brief handoff note makes the new session productive immediately. Capture three things: current state, open questions, and key decisions with rationale. Here is what an effective session anchor looks like:

Cart API refactor — handoff
- Extracted cart logic into CartService (src/services/cart.ts).
  Unit tests pass; E2E add-to-cart failing on webhook mock timeout.
- Open: Should CartService own inventory checks?
  Webhook mock hangs; consider switching to msw.
- Decided: class-based service (needs shared DB pool),
  server-side-only pricing (security; see PR #247).

Two minutes to write. Saves twenty minutes of re-explaining. The agent gets the what, the why, and the open threads — exactly the context that compaction would have destroyed.

The deeper pattern is simple: context is a depreciating asset. Externalize before you restart. Restart before you compact. Compact before you push through. When a reset solves a coherence problem, you are still in this chapter; when the same reset is primarily about token spend or cache behavior, you have crossed into Chapter 18.


5.11 Cross-References

  • Chapter 3 — concrete mechanics of CLAUDE.md / AGENTS.md / .windsurf/rules/, fileMatch glob syntax, conditional activation rules, team-review workflow for rules files. This chapter positions instruction files in the layering model; that chapter teaches you how to write good ones.
  • Chapter 2 — the harness internals that implement compaction, session persistence, cwd handling, and the rest of the lifecycle machinery. This chapter teaches you the operator surface; that chapter teaches you what is happening underneath.
  • Chapter 16 — tool-access governance, MCP toggles, permission scoping. The other plane of the exclusion problem (data exfiltration through tools, not data leakage into context).
  • Chapter 21 — the enterprise context-exclusion controls (cody.contextFilters as compliance infrastructure, audit-log shipping, organization-level rule distribution) that turn the boundary into an admin-managed policy.
  • Chapter 18 — the cost framing of every move in this chapter. Read together with this chapter, not separately.
  • Chapter 17 — the diagnostic workflow when you cannot tell whether you are looking at context rot, a memory store gone stale, or a real model failure. The triage logic depends on the layering model this chapter establishes.

5.12 Takeaways

  • Run /context to audit per-component token consumption before assuming the context window is full from conversation alone — a single MCP server or auto-generated instruction file may be consuming a large fraction of the budget before you type a word.
  • Before concluding the agent is producing poor output, drop one level down the precision ladder — from whole file to line range, from line range to symbol, or from any project context to no context for generic questions — rather than re-prompting at the same level.
  • Encode file exclusions at each layer separately: files.exclude and search.exclude for workspace indexing, .clineignore for the agent session, and cody.contextFilters for enterprise retrieval — .gitignore alone does not cover all planes.
  • Route each piece of durable context to the right store before writing it: team-owned invariants go in the repo instruction file, temporary project facts go in workspace memory, genuinely generic personal defaults go in user memory, and repeatable procedures become workflows or skills rather than free-text memory entries.
  • After a migration or architecture change, scan workspace memory and delete or rewrite any entries that encode the old shape — stale memory silently steers every future session until it is removed.
  • Before forking or resetting a session, write a three-part handoff note — current state, open questions, and key decisions with rationale — and treat that note as the durable checkpoint, not the session transcript itself.
  • Switch to a fresh stateless session for code review and test generation even when you have a live stateful session — a reviewer who shares the author’s working memory is not an independent check.
  • When a session starts repeating itself or re-proposing rejected paths, start a fresh session with a handoff note rather than reaching for /compact — small tasks with clean context outperform larger tasks carried forward on accumulated history.