3  Persistent Instructions and Rules

The most impactful thing you can do with a coding agent takes less than sixty lines of markdown — and most practitioners write ten times that before learning why less is more.

3.1 What Rules Files Do

Every serious coding agent has a standing-instructions surface. The filename changes — CLAUDE.md, AGENTS.md, GEMINI.md, .cursorrules, or Cursor’s path-scoped Markdown rule files — but the pattern does not [33], [34], [35]. These files preload the expectations that should hold before you type the first prompt. The durable lesson is not which product picked which filename. It is that serious agent work needs a reviewed place for standing orders.

These files are not documentation. They are not READMEs for humans. They are onboarding briefs for a stateless colleague who forgets everything between sessions [36], [37]. This chapter uses *rules file* for the concrete files and *persistent instructions* for the broader concept: standing guidance that the harness loads before or during a session. Context is the job — as Chapter 5 covers in depth — and rules files are the single highest-leverage point for getting context right, because they inject into every conversation automatically. Cross-tool convergence on AGENTS.md as a shared filename means a well-written rules file benefits you regardless of which agent you sit down with [38], [34].

What belongs in a rules file? Three categories. First, the what: your tech stack, project structure, and what each component does — especially in monorepos where the agent can’t infer boundaries from file names alone [38]. Second, the why: the purpose of the project and its key parts, so the agent understands intent rather than just structure. Third, the how: build commands, test commands, and verification steps. In Schmid’s AGENTS.md benchmark, tools explicitly mentioned in the rules file were used 160 times more often than unmentioned tools [38]. Treat that as a strong directional result from one benchmark, not a universal constant. If you use pnpm instead of npm, or uv instead of pip, say so — otherwise the agent will default to the more common tool and waste your time debugging package manager conflicts. Throwaway prototypes and scratch repos don’t need one; if you won’t return to the repo more than once a week, the briefing tax of maintaining a rules file isn’t worth it.

3.2 Three Levels of Scope

Effective rules files split across three levels of ownership: user-level personal preferences, repository-level team conventions, and agent-specific persona overrides. Claude Code loads rules from four physical scopes — enterprise, user (~/.claude/CLAUDE.md), project root, and nested subdirectories [39], [36]. Gemini CLI combines global and project-specific GEMINI.md files similarly [40]. OpenCode reads from local AGENTS.md, then local CLAUDE.md, then ~/.config/opencode/AGENTS.md, then ~/.claude/CLAUDE.md — and can pull in remote URLs with a five-second timeout for centrally managed standards [33]. GitHub Copilot supports repository-wide instructions, path-scoped instructions with glob frontmatter, and per-agent persona files at .github/agents/CUSTOM-AGENT-NAME.md — three tiers that compose rather than collide [41].

Be precise about what “compose” means here. Across these tools, multiple instruction files load by concatenation, not by overwrite. There is no built-in winner-takes-all precedence — the agent sees all loaded layers in its context and resolves any conflict by attention, recency, and prose specificity. Treat composition as additive by default and ordering as a soft signal, not a contract. If your project CLAUDE.md says “prefer functional components” and your user-level file says “prefer class components,” the agent reads both and the outcome is unpredictable. Kiro is one of the few tools that documents an explicit override rule — workspace steering files take priority over global ones when both define the same instruction [42] — but most tools leave the conflict to the model.

The discipline is to keep each level owning a different concern so concatenation does not produce conflict in the first place. Personal preferences (your debugging style, your verbosity tolerance, your commit message format) belong at user level. Team standards (architectural patterns, library choices, test conventions) belong at the repository level so they travel with the code, get reviewed in PRs, and apply to every developer’s sessions. Agent-specific persona files belong at agent level for specialized roles — a security-review agent that prioritizes different concerns than the default. When two levels genuinely must speak to the same topic, write the more specific file to explicitly override the general one in prose (“This file overrides the user-level guidance on X: do Y instead”). Anything stronger than that is enforcement, not instruction, and belongs in hooks or policies rather than markdown. The harness chapter (Chapter 2) covers settings-file precedence, which does follow strict ordering rules; instruction files do not.
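
When an override is unavoidable, the prose carries it. A minimal sketch of a project-level rules file deliberately overriding a user-level preference (the component-style example is illustrative):

```markdown
## Component style

This file overrides any user-level guidance on component style:
always use functional components with hooks in this repository,
even if personal preferences say otherwise.
```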

A related failure mode bites CI/CD pipelines. Custom instruction files are loaded by default; passing --no-custom-instructions to Copilot CLI is a deliberate opt-out, not a default state [41]. Pipelines that pass that flag unknowingly run with no project context. The reverse mistake is letting a developer’s user-level memory or hooks pollute pipeline runs. When you want a clean-slate agent in CI, name the layers you load explicitly — and verify what loads automatically regardless. Assign ownership for the repository-level rules file, review it on a fixed cadence — quarterly is usually enough — and delete instructions that linters, defaults, or current code structure now enforce automatically.
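
Naming the layers explicitly is scriptable. A sketch of a CI preflight that reports which candidate instruction files actually exist before an agent run; the filename list is illustrative and should be extended for whatever your harness loads:

```python
from pathlib import Path

# Candidate instruction-file locations drawn from the tools above;
# the list is an assumption -- extend it to match your own harness.
CANDIDATE_LAYERS = [
    "CLAUDE.md",
    "AGENTS.md",
    ".github/copilot-instructions.md",
    str(Path.home() / ".claude" / "CLAUDE.md"),
]

def loaded_layers(repo_root: str) -> list[str]:
    """Report which candidate instruction files exist, so a CI job can
    log exactly which layers an agent run will see instead of guessing."""
    found = []
    for layer in CANDIDATE_LAYERS:
        path = Path(layer)
        if not path.is_absolute():
            path = Path(repo_root) / layer
        if path.is_file():
            found.append(layer)
    return found
```

Logging this list at the top of every pipeline run turns "what context did the agent have?" from an archaeology exercise into a grep.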

3.3 Activation Modes: Always, Conditional, Manual

The next decision after what a rule says is when it fires. Three modes cover the spectrum: always-on for invariants that apply everywhere, file-pattern conditional for domain-specific rules tied to a glob, and manual for guidance you invoke explicitly [43], [34]. Kiro names four inclusion modes directly in its steering-file frontmatter: always for every interaction, fileMatch for glob-keyed activation, auto for semantic matching against the file’s description, and a manual #steering-file-name mention in chat [42]. GitHub Copilot offers the same shape with different syntax: .github/copilot-instructions.md is repository-wide and always-on, while files under .github/instructions/*.instructions.md use a frontmatter applyTo field with glob patterns to fire only on matching files [41]. Amp’s AGENTS.md files at the root are always-on; nested AGENTS.md files trigger on glob patterns for their directory automatically [34]. Windsurf adds a model_decision mode where the agent itself decides if the rule body is relevant based on its description [43].

The principle transfers across tools even when the syntax differs. Pick the activation mode by asking what the rule has to constrain. Always-on is for invariants the agent should never violate regardless of where it’s working — package manager choice, security policy floors, language-wide style rules. File-pattern conditional is for domain-specific norms keyed to where the agent is editing — TypeScript strict patterns under **/*.ts, REST conventions under app/api/**/*, accessibility rules under components/**/*.tsx. Manual is for procedures the agent shouldn’t run unless you ask — a release-prep checklist, a security-audit walkthrough.
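
In Kiro's syntax the mode is a single frontmatter key. A sketch of an always-on invariants file, with illustrative contents:

```markdown
---
inclusion: always
---
# Invariants
- Use pnpm, never npm
- Never log secrets or API tokens
```

A release-prep checklist would keep the same shape with the manual inclusion mode in the frontmatter, so it loads only when you mention the steering file by name in chat.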

3.4 Single-Pattern vs. Array-Pattern Conditional Rules

A single conditional file often needs to cover several related file types. Kiro supports both forms in fileMatchPattern frontmatter: a single glob string, or an array of globs that all activate the same rule [42]. The single-pattern form is the right default; reach for the array form when one rule body genuinely covers several extensions or paths.

A single-pattern API rule:

```markdown
---
inclusion: fileMatch
fileMatchPattern: "app/api/**/*.ts"
---
# API conventions
- All routes must validate input with Zod schemas
- Errors return { error: string, code: number }
- Never expose stack traces in production responses
```

An array-pattern TypeScript rule that bundles sources and their config files:

```markdown
---
inclusion: fileMatch
fileMatchPattern: ["**/*.ts", "**/*.tsx", "**/tsconfig.*.json"]
---
# TypeScript conventions
- Strict mode required; no implicit any
- Branded types for all entity IDs
- Public exports require explicit return types
```

The array form avoids duplicating the same rule across three near-identical files. The same operator move transfers across surfaces even though the syntax shifts: in Copilot, you’d express the array intent as a single applyTo glob using brace expansion (applyTo: "**/*.{ts,tsx}") inside .github/instructions/typescript.instructions.md [41]; in Amp or Windsurf, you’d place a nested AGENTS.md at the directory root that owns those file types and let directory-scoped activation do the work [34]. The body is identical in all three. What changes is the binding: Kiro binds by fileMatchPattern, Copilot binds by applyTo, nested AGENTS.md binds by directory location. Once you see the boundary as the contract, the surface stops mattering.
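
The Copilot form of the same rule, as a sketch of .github/instructions/typescript.instructions.md using the applyTo frontmatter described above:

```markdown
---
applyTo: "**/*.{ts,tsx}"
---
# TypeScript conventions
- Strict mode required; no implicit any
- Branded types for all entity IDs
- Public exports require explicit return types
```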

3.5 Test Pattern Coverage Before Committing

Globs lie. The cheapest way to catch a bad pattern is to list what it actually matches before you ship it. Run the same glob against your repo with git ls-files (or rg --files | grep):

```shell
$ git ls-files 'app/api/**/*.ts' | wc -l
47
$ git ls-files 'app/api/**/*.ts' | head
app/api/auth/login/route.ts
app/api/auth/logout/route.ts
app/api/users/[id]/route.ts
...
```

Three diagnostic outcomes. If the count is near zero, the pattern is too narrow — typically because you wrote app/api/*.ts and missed nested folders, or pinned exact filenames. If the count is most of the repo, the pattern collapses to always-on and the rule belongs in the repo-wide file instead, where it costs you only one copy of context. If the count matches the domain you intended, the pattern is sound. Re-run the same check whenever directory layout shifts; a passing pattern from six months ago can drift silently as the codebase grows.
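
The triage can be scripted. A sketch using Python's pathlib globbing as a stand-in for git ls-files; the 80% threshold for "too broad" is illustrative, not from any tool's documentation:

```python
from pathlib import Path

def classify_pattern(repo_root: str, pattern: str) -> str:
    """Triage a conditional rule's glob: too narrow, effectively
    always-on, or plausibly scoped. Mirrors the git ls-files check."""
    root = Path(repo_root)
    matches = sum(1 for p in root.glob(pattern) if p.is_file())
    total = sum(1 for p in root.rglob("*") if p.is_file())
    if matches == 0:
        return "too narrow: matches nothing"
    if total and matches / total > 0.8:  # illustrative threshold
        return "too broad: collapses to always-on"
    return f"plausible: {matches} of {total} files"
```

Run it over every fileMatchPattern in your steering files as a pre-commit step and the silent-coverage-gap failure mode becomes a visible diff comment.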

3.6 Handle Overlapping Matches Deliberately

Two scoped rules will eventually fire on the same file. A repo-wide TypeScript rule (**/*.ts) and an API-specific rule (app/api/**/*.ts) both match app/api/users/route.ts. Like the level composition discussed in Section 3.2, pattern-scoped instructions concatenate by default — both bodies enter the context, and the agent picks one only if the prose tells it to.

Three operator moves keep this predictable:

  1. Make scopes disjoint where possible. If the API rule fully supersedes the language rule for routes, narrow the language rule to **/*.ts minus app/api/** by writing it as a list of positive globs that exclude API paths, or by moving the API-specific guidance up into the language rule under a clearly labeled subsection.
  2. Name the override explicitly in prose. When scopes must overlap, the more specific file should say so: “Within app/api/**, this file’s error-shape rule overrides any conflicting guidance from the repo-wide TypeScript rule.” Recency and specificity in the prose are the only signals the agent gets.
  3. Promote conflicts to enforcement. If both rules genuinely matter and you cannot tolerate the agent picking one by attention, the conflict has outgrown markdown. Move the binding rule into a hook, a lint rule, or a pre-commit check.

Two scoping failures eat most of the value. Patterns written too broadly (**/* or **/*.js in a JS-everywhere repo) collapse into always-on injection, defeating the point and burning context window on irrelevant rules. Patterns written too narrowly (exact filenames instead of directory globs) miss files that need the context, creating silent coverage gaps where the agent confidently writes non-conforming code. The remedy is the loop above: pick the boundary, encode it, list matches, resolve overlaps.

Semantic-match modes — Kiro’s auto and Windsurf’s model_decision — are the most flexible options and the easiest to misuse. The model gets a one-line description of the rule and decides whether the full body is relevant [42]. When the description is sharp (“Use this rule when generating database migrations”), the model usually picks correctly. When the description is vague (“General coding guidance”), the rule is either ignored or always loaded. Treat the description like a skill description — terse, directive, anchored to a concrete trigger.
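
A sketch of the sharp version as a Kiro-style auto rule; the one-line description is the operative part, and the exact frontmatter keys are assumptions based on the modes described above:

```markdown
---
inclusion: auto
description: Use this rule when generating database migrations
---
# Migration conventions
- One schema change per migration file
- Every migration must have a reversible down step
```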

3.7 Progressive Disclosure

Progressive disclosure is the single highest-leverage move in rules-file engineering. Start lean. Add only what fails without it. Cite the rest — point to a skill, a path-scoped file, or a reference doc the agent loads on demand. A 2,000-line rules file can defeat its own purpose. In Schmid’s benchmark, auto-generated rules files reduced task success rates by roughly 3% while increasing inference cost by over 20% [38]. Carefully human-written files improved performance by only about 4% in that test and still increased cost by up to 19%. The exact percentages should not be treated as field-wide constants, but the mechanism is robust: unnecessary instructions dilute attention. In the same benchmark, extra directives increased reasoning tokens by 14–22% because the agent processed irrelevant instructions alongside relevant ones [38].

The math makes the problem concrete, but it should be read as an operating heuristic rather than a hard model limit. Several practitioners report that frontier models become less reliable once instruction lists climb into the low hundreds, and one useful working budget is roughly 150–200 instructions. Claude Code’s system prompt already consumes part of that attention budget before your rules file is even considered [37], [44]. Spend the remaining budget wisely, because every unnecessary rule competes for attention with the ones that actually matter. HumanLayer’s root CLAUDE.md — used in production daily — is under 60 lines [37], [45].

The principle that solves this is progressive disclosure: load the minimum context needed upfront, and make additional detail available on demand. Rather than cramming everything into one file, treat your rules file as a table of contents that points to deeper resources [46], [47]. One practitioner cut their CLAUDE.md from 170 lines to 40 lines using this router pattern — and immediately saw improved instruction-following, consistent with the “Lost in the Middle” finding that models deprioritize content in the center of long inputs [46]. Small repos where the full rules fit comfortably on one screen don’t need this — splitting too early fragments the mental model and hides what’s active. Reach for progressive disclosure when the rules file crosses 200–300 lines, when rules are bleeding across unrelated parts of the repo, or when the agent starts ignoring rules near the bottom of the file.

A lean root CLAUDE.md:

```markdown
# Project: my-app

## Stack
TypeScript, React, PostgreSQL, Prisma ORM

## Commands
- Build: `pnpm build`
- Test: `pnpm test`
- Lint: `pnpm lint`
- Single test: `pnpm test -- --grep "pattern"`

## Key conventions
- Use pnpm, never npm
- All API routes in src/api/
- Database migrations via `pnpm prisma migrate dev`

## Deep context (read when relevant)
- Architecture: see docs/architecture.md
- API patterns: see docs/api-conventions.md
- Testing strategy: see docs/testing-guide.md
```

Compare that to the kind of bloated block many teams start with:

```markdown
# Don't do this
- Use pnpm, and remember our API pagination rules, and prefer repository pattern,
  and auth exceptions live in middleware.ts, and analytics events must use snake_case,
  and docs screenshots belong in /assets, and mobile spacing differs on checkout,
  and the billing service has a retry exception, and...
```

The failure mode is visible. The high-signal rules at the top get obeyed. The repo trivia and edge cases buried halfway down compete for attention and the tail rules disappear into background noise. The lean version gives the agent what it needs for every session and tells it where to find the rest.

Nested rules files extend this pattern at the directory level. Claude Code loads subdirectory CLAUDE.md files only when the agent accesses files in that directory [39], [48]. Amp and Kiro do the same with directory-scoped AGENTS.md and steering files [34], [42]. Your src/api/CLAUDE.md can hold API-specific conventions without bloating the root file — and the agent only sees those conventions when it’s actually working on API code.

Codebase overviews in rules files — detailed descriptions of what every directory contains — provide no measurable benefit in helping agents navigate faster [38]. Agents found relevant files in roughly the same number of steps with or without an overview section. Save those tokens for instructions that actually change behavior.
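
The length discipline is easy to automate. A sketch that flags rules files over a line budget; the filename set and the 250-line threshold are assumptions derived from the 200-300 line guidance above:

```python
from pathlib import Path

LINE_BUDGET = 250  # illustrative, from the 200-300 line guidance
RULES_NAMES = {"CLAUDE.md", "AGENTS.md", "GEMINI.md"}

def oversized_rules_files(repo_root: str) -> list[tuple[str, int]]:
    """Return (path, line_count) for every rules file over budget --
    candidates for the router pattern and path-scoped splitting."""
    over = []
    for path in Path(repo_root).rglob("*"):
        if path.name in RULES_NAMES and path.is_file():
            lines = len(path.read_text().splitlines())
            if lines > LINE_BUDGET:
                over.append((str(path.relative_to(repo_root)), lines))
    return sorted(over)
```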

3.8 Writing Effective Standing Instructions

The difference between a rules file that changes agent behavior and one that gets ignored comes down to three properties: specificity, testability, and examples.

Be specific, not aspirational. “Write clean code” is meaningless to an agent [49]. “Always use interactions.create() for API calls, never generateContent()” is an instruction the agent can follow [50]. Directives outperform implications: “Always use X” works better than “X is the recommended approach” — the first is an instruction, the second is trivia the agent won’t act on [51]. And don’t tell the agent things it already does correctly. Cursor already defaults to Server Actions over API routes in Next.js; a rule asking for that wastes context window without changing behavior [49]. The before/after from a project with no CONVENTIONS.md is concrete: without a line specifying “Prefer httpx over requests” and “Use types everywhere possible,” the agent chose requests and omitted type hints; with those two lines, it chose httpx and added type hints [52].

Make instructions testable. Each rule should have a clear pass/fail criterion. “Follow best practices for error handling” cannot be verified. “All API endpoints must validate input with Zod schemas” can be checked by running the type checker or reviewing the code. Rules that describe binary outcomes — file creation, naming patterns, presence of a specific call — are the most reliable because they have no interpretation overhead [49].
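
Testability means a rule like the Zod example can be spot-checked mechanically. A rough sketch; the regex is a heuristic stand-in for a real type check, and the route.ts file layout is an assumption:

```python
import re
from pathlib import Path

def routes_missing_zod(api_dir: str) -> list[str]:
    """Flag route files with no visible Zod usage, as a rough pass/fail
    probe for 'API endpoints must validate input with Zod schemas'."""
    zod_pattern = re.compile(r"from ['\"]zod['\"]|\bz\.\w+\(")
    missing = []
    for path in sorted(Path(api_dir).rglob("route.ts")):
        if not zod_pattern.search(path.read_text()):
            missing.append(str(path))
    return missing
```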

Provide examples, not just rules. Code examples in prompts outperform natural language instructions alone — agents frequently select deprecated libraries or outdated patterns without concrete examples to anchor them. If your project uses a specific pattern for database access, show it:

## Database access pattern
Always use the repository pattern. Example:

```typescript
// CORRECT
const user = await userRepository.findById(id);

// WRONG - never query Prisma directly in route handlers
const user = await prisma.user.findUnique({ where: { id } });
```

Optimize for agent consumption, not human readability. Rules files are not documentation for your team — they are inputs for an LLM. Terse, declarative, one item per line, imperative style [36]. Each learning should pass the “5-minute test”: would this save 5+ minutes next time the agent encounters this situation? If not, it doesn’t belong in the rules file — it belongs in a reference document the agent loads on demand. Anthropic’s own guidance reinforces this: specific, concrete instructions (“Use 2-space indentation”) are followed more consistently than abstract ones (“Format code properly”), and CLAUDE.md files beyond roughly 200 lines demonstrably reduce instruction adherence because they consume more context [48].

Delegate to automation what tools handle better. Style rules that ESLint, TypeScript, or Prettier can enforce should not be duplicated as written instructions. Automated backpressure — linters, type checkers, test failures — is more reliable than written instructions for correcting trivial errors [44]. Agents self-correct on build feedback without needing a rule that says “follow the style guide.” Reserve your rules file for knowledge that can’t be automated: domain-specific patterns, architectural boundaries, workflow conventions that no linter can detect.

If the rule must hold deterministically, stop stretching the rules file and move enforcement into automation. Rules files are guidance. Hooks, deny rules, and pre-commit checks are enforcement [53]. Some practitioners argue that .cursorrules and similar files are best understood as suggestions the agent may follow, not constraints it must — and that genuinely binding workflows belong in executable checks whose linter, test-runner, or hook output the agent is forced to confront [54]. Chapter 16 covers those hard guardrails in depth.

One advanced technique: systematic optimization through feedback loops. Iteratively refining your rules file using test results and LLM evaluations yields measurable improvements. The useful operator loop is simple: pick a fixed task set, change one rule, rerun the tasks, and keep only the rules that improve the result. That turns rules-file maintenance from guesswork into a measurable editing routine, the same way Schmid’s practical guide to evaluating skills recommends 3–5 trials per case to detect distributions rather than vibe-checking single runs [51].
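
The loop itself is a few lines once you have a task harness. A sketch where run_task stands in for your agent invocation returning pass/fail; the function and the trial count are illustrative:

```python
def pass_rate(run_task, tasks, trials=5):
    """Mean success over a fixed task set, several trials per case, so a
    rule change is judged against a distribution rather than one run."""
    outcomes = [bool(run_task(t)) for t in tasks for _ in range(trials)]
    return sum(outcomes) / len(outcomes)

def keep_rule_change(run_task, tasks, baseline_rate, trials=5):
    """Accept the edited rules file only if the rerun beats baseline."""
    return pass_rate(run_task, tasks, trials) > baseline_rate
```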

3.9 Rules Files vs. Agent Memory

A common confusion is conflating rules files with agent memory. They live next to each other and both produce text the agent reads at session start, but they have different authorship, different review models, and different trust levels.

Rules files are human-authored standing orders. You write them, you review them in pull requests, they are version-controlled, and they encode decisions the team has explicitly made. Agent memory is agent-generated session residue — it accumulates as the agent observes patterns, captures preferences, or notes corrections from prior sessions. Windsurf draws this boundary explicitly: agent-generated Memories live under ~/.codeium/windsurf/memories/ and are workspace-local; human-authored Rules live under .windsurf/rules/ and are version-controlled and shared [43]. Cursor independently built the same split [55]. The two-store design is not a vendor quirk; it reflects two genuinely different needs.

Reach for instruction files when the convention is stable, team-wide, and worth reviewing. Reach for memory when you want session-accumulated context to carry forward but don’t yet trust it enough to encode as policy. The graduation path matters: candidate learnings emerge in memory, get reviewed by a human, and only then graduate into the rules file as standing orders. The Claude Diary pattern formalizes this loop — capture session learnings, periodically reflect across them, and update your rules file from the distilled output [56]. The rules file remains human-owned; the memory is the staging ground.

Failure modes look different in each store. Stale rules files calcify policy that no longer matches the codebase, and the agent confidently follows outdated patterns. Stale memories silently encode early hallucinations or one-off corrections that propagate invisibly across all future sessions. The discipline for both is the same: review, prune, and verify against current code. Cap your Learnings section at 20–30 items, review monthly, and delete entries that are no longer relevant [36]. Chapter 5 covers the broader memory architecture; this chapter owns the rules file.

3.10 Encoding Team Standards

Two developers on the same team, using the same codebase and the same AI tool, produce materially different code — not because the agent lacks knowledge, but because they give it different instructions. Rules files solve this inconsistency by making team standards executable rather than aspirational.

The consistency problem grows with team size. On a five-person team, standards can be maintained through conversation. Beyond roughly fifteen developers, conversation breaks down and you need infrastructure. A rules file committed to your repository is that infrastructure. It travels with the code, gets reviewed in pull requests, and applies to every developer’s agent sessions automatically. When a new team member clones the repo, their agent inherits the team’s standards from the first session [35].

The process of creating team standards often reveals hidden disagreements. Extracting tacit knowledge — interviewing senior engineers about what they instinctively check, reject, or flag during code review — frequently surfaces conflicts that were never articulated because different seniors reviewed different pull requests. Making these positions explicit and resolving them produces a clearer, more consistent codebase even before the agent benefits.

Effective team rules files have a four-part anatomy: role (what kind of engineering the agent should prioritize), context (project-specific facts), categorized standards (organized by domain — API design, error handling, testing, security), and output format (how the agent should structure its responses). Here’s what categorized standards look like in practice:

```markdown
## Code Review Standards

### Error Handling
- All API endpoints return structured errors: { error: string, code: number }
- Never expose stack traces in production responses
- Log errors with request ID for traceability

### Database
- All queries use parameterized inputs (no string interpolation)
- New tables require a migration file in db/migrations/
- Foreign keys must have ON DELETE behavior specified

### Testing
- New API endpoints require integration tests in tests/integration/
- Test both success and error paths
- Mock external services, never call them in tests
```

For architectural decisions, embedding ADRs in the repository and wiring them into the agent’s instructions creates enforceable governance: when ADRs live in the repo and the rules file says “cross-reference every change against docs/adr/,” the agent flags violations and refuses instructions that contradict an ADR, guiding the developer toward updating the ADR rather than silently breaking the architectural contract [57], [58]. Use the configuration hierarchy to split team and personal concerns: project-level rules encode team standards, user-level rules (~/.claude/CLAUDE.md, ~/.config/opencode/AGENTS.md) hold individual preferences [33]. For distribution across many repos, do not copy-paste — keep one reviewed source of truth and publish it through your tooling’s install path, the way Vercel ships its Web Interface Guidelines as a single curl install that works across Claude Code, Cursor, OpenCode, Windsurf, and Gemini CLI [59].

A team rules file is governed code. Before you ship one, lock down four things:

  • Owner. A named maintainer (a tech lead, a platform team, or a code-owner group) responsible for accepting changes. No owner means no maintenance.
  • Review cadence. Quarterly is usually enough. Each review prunes rules that linters now enforce, defaults now match, or the codebase has outgrown.
  • Source of truth. One canonical file or repo. Other repos consume it through an install command or sync, never through copy-paste.
  • Rollout path. How updates reach every developer’s machine — repo PRs for project files, package install for shared standards, CI checks that fail when the local copy drifts.
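
The drift check in the last bullet reduces to a hash comparison in CI. A sketch, with illustrative paths:

```python
import hashlib
from pathlib import Path

def rules_file_drifted(local_copy: str, canonical: str) -> bool:
    """True when the vendored rules file no longer matches the canonical
    source -- wire this into CI so the build fails on silent drift."""
    digest = lambda p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
    return digest(local_copy) != digest(canonical)
```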

Once a rules file’s growth is consistently outpacing what these four controls can absorb — when standards keep splitting into phases, handoffs, and role-specific procedures — you have outgrown the format. That is the entry point to the process-framework overlays covered in Chapter 9, where opinionated systems like BMAD package phase sequencing and role definitions as installable overlays rather than markdown rules [60].

3.11 When Rules Files Aren’t Enough

Rules files are one layer in a larger stack. When you reach past them, the question is which layer the concern actually belongs in:

  • Rules files: standing defaults. Human-authored, version-controlled, load on every session.
  • Memory: evolving facts the agent or session accumulates (Chapter 5).
  • Skills, commands, and process frameworks: invoked multi-step procedures with phases, arguments, or branching (Chapter 10, Chapter 9).
  • Hooks and policies: deterministic enforcement that must hold every time, regardless of model attention (Chapter 16, Chapter 11).

The boundary is the load pattern. Rules always load. Procedures fire on demand. Enforcement holds deterministically. Keep rules files lean, keep them reviewed, and let the other layers do their own work.

3.12 Takeaways

  • Keep each scope level owning a different concern: personal preferences at user level, team standards at repo level (reviewed in PRs), and agent-specific persona overrides at agent level — so multi-file concatenation does not produce silent conflicts.
  • Route each rule to the right activation mode based on what it must constrain: always-on for invariants the agent must never violate, file-pattern conditional for domain-specific norms keyed to where the agent is editing, and manual for procedures the agent should run only when explicitly asked.
  • Before committing a conditional rule’s glob pattern, run git ls-files against it and verify the match count: near-zero means the pattern is too narrow, most-of-repo means it collapses to always-on, and an intent-matching count means it is sound.
  • Use progressive disclosure to keep the root rules file lean: put only always-needed instructions in the root, then move domain-specific detail to path-scoped files or reference docs the agent loads on demand.
  • Write standing instructions as explicit commands the agent can obey, not soft recommendations: say Always use X, not X is recommended.
  • Move any rule that must hold deterministically out of the rules file and into a hook, lint check, or pre-commit enforcement — rules files are guidance the model may deprioritize, not constraints it cannot override.
  • Treat memory as a staging area: keep unreviewed session learnings there, and promote them into the version-controlled rules file only after they prove stable and the team has reviewed them.
  • Treat the repository rules file like governed code: assign an owner, review it on a cadence, keep one source of truth, and define how updates roll out to every developer.