CDX-301g · Module 2
Context Budgets & Window Management
3 min read
Every agent has a finite context window, and partitioning work effectively means budgeting that window. A typical Codex agent has 200k tokens of context. From that, subtract the system prompt (~2k), AGENTS.md (~1-3k), conversation history (which grows with each turn), and tool call overhead (~500 tokens per call). What remains is your effective working context: the space the agent uses to read files, reason about code, and produce output. Overcommitting it leads to truncation, summarization, and degraded output quality.
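The subtraction described above can be written down as a small calculation. The sketch below is illustrative only: the `ContextBudget` class and its field names are hypothetical, and the defaults simply reuse the approximate figures from the paragraph (with AGENTS.md taken at the upper end of its range).

```python
from dataclasses import dataclass


@dataclass
class ContextBudget:
    """Hypothetical budget model; defaults mirror the rough figures in the text."""
    window: int = 200_000        # total context window
    system_prompt: int = 2_000   # ~2k
    agents_md: int = 3_000       # upper end of the ~1-3k range (assumption)
    tool_call_cost: int = 500    # ~500 tokens per tool call

    def working_context(self, history_tokens: int, tool_calls: int) -> int:
        """Tokens left for reading files, reasoning, and producing output."""
        spent = (
            self.system_prompt
            + self.agents_md
            + history_tokens
            + tool_calls * self.tool_call_cost
        )
        return max(self.window - spent, 0)


budget = ContextBudget()
# Fresh session: no history and no tool calls yet.
print(budget.working_context(history_tokens=0, tool_calls=0))        # 195000
# Later in the session: 40k tokens of history and 20 tool calls.
print(budget.working_context(history_tokens=40_000, tool_calls=20))  # 145000
```

The point of the second call is that the working context shrinks as the session goes on, even if no new files are read.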
Context budgeting has a practical implication for task decomposition: a task that requires reading a large codebase must be split into smaller units, not because the work is complex but because the reading alone exceeds the context budget. An agent that needs to read 50 files to understand a refactoring target can spend up to 100k tokens on reading (at ~2,000 tokens per file), half the window, before it writes any code. The fix is to decompose by context budget: each agent reads only the files it modifies, plus the minimal shared context it needs for type checking.
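One possible way to decompose by context budget is a simple greedy grouping: walk the file list and start a new batch whenever the next file would push the current batch over its read budget. This is a minimal sketch, not a prescribed algorithm; `partition_by_budget`, the file names, and the 30k read budget are all assumptions chosen for illustration.

```python
def partition_by_budget(file_tokens, read_budget, shared_tokens=0):
    """Greedily group files into batches whose read cost fits read_budget.

    file_tokens: dict mapping file path -> estimated token count.
    shared_tokens: cost of shared context (e.g. type definitions) that
    every batch loads in addition to its own files.
    """
    capacity = read_budget - shared_tokens
    if capacity <= 0:
        raise ValueError("shared context alone exceeds the read budget")

    batches, current, used = [], [], 0
    for path, tokens in file_tokens.items():
        if tokens > capacity:
            raise ValueError(f"{path} does not fit in a single batch")
        if used + tokens > capacity:
            # Close the current batch and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += tokens
    if current:
        batches.append(current)
    return batches


# 50 hypothetical files at ~2,000 tokens each (100k tokens in total),
# a 30k read budget per agent, and 2k of shared type definitions.
files = {f"src/module_{i}.py": 2_000 for i in range(50)}
batches = partition_by_budget(files, read_budget=30_000, shared_tokens=2_000)
print(len(batches), [len(b) for b in batches])  # 4 [14, 14, 14, 8]
```

Each batch would then go to its own agent. In practice the grouping should also follow module boundaries, which this sketch ignores.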
# Context budget allocation (200k window)

Fixed costs (unavoidable):
  System prompt:          ~2,000 tokens
  AGENTS.md (root):       ~2,000 tokens
  Dir-scoped AGENTS.md:   ~1,000 tokens
  Tool overhead:          ~3,000 tokens
  ─────────────────────────────────────
  Subtotal:               ~8,000 tokens

Variable costs (per task):
  Source files to read:   ~500-2,000 tokens/file
  Type definitions:       ~200-500 tokens/file
  Test fixtures:          ~300-800 tokens/file
  Hand-off context:       ~500-1,000 tokens
  ─────────────────────────────────────
  Budget for reasoning:   remaining tokens

Rule of thumb:
  If reading inputs exceeds 40% of the context window,
  the task is too large: decompose further.
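As a worked example of the rule of thumb, the sketch below totals the variable costs for a task and checks them against the 40% ceiling. The function names are hypothetical, and the per-file costs use the upper end of each range in the allocation above, so the result is a worst-case estimate rather than a measurement.

```python
WINDOW = 200_000        # context window from the allocation above
FIXED_COSTS = 8_000     # fixed-cost subtotal from the allocation above
INPUT_CEILING = 0.40    # the 40% rule of thumb


def input_tokens(source_files, type_files, fixture_files, handoff=1_000):
    """Worst-case read cost, using the upper per-file figures above."""
    return (source_files * 2_000 + type_files * 500
            + fixture_files * 800 + handoff)


def check_task(inputs, window=WINDOW, fixed=FIXED_COSTS):
    """Return (share of window spent on inputs, tokens left, too_large)."""
    share = inputs / window
    remaining = window - fixed - inputs
    return share, remaining, share > INPUT_CEILING


for n_sources in (30, 50):
    inputs = input_tokens(n_sources, type_files=10, fixture_files=5)
    share, remaining, too_large = check_task(inputs)
    verdict = "decompose further" if too_large else "ok"
    print(f"{n_sources} files: {inputs:,} input tokens "
          f"({share:.0%}), {remaining:,} left -> {verdict}")
# 30 files: 70,000 input tokens (35%), 122,000 left -> ok
# 50 files: 110,000 input tokens (55%), 82,000 left -> decompose further
```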
Do This
- Budget context explicitly — count tokens consumed by inputs before launching the agent
- Apply the 40% rule: reading inputs should consume less than 40% of the context window
- Decompose by context budget when tasks require reading large codebases
- Use directory-scoped AGENTS.md to avoid loading irrelevant project context
Avoid This
- Assume the context window is unlimited — truncation happens silently and degrades output
- Load every potentially relevant file "just in case"; include a file only when the task clearly needs it
- Ignore conversation history growth — multi-turn agents consume context with each exchange
- Forget tool call overhead — each tool invocation consumes context for the call and the result
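To make the last two points concrete, the following sketch estimates how many turns fit in the window once history growth and tool calls are charged per turn. It is a rough model under stated assumptions: `turns_until_exhausted` is a hypothetical helper, the 3,000 tokens of history per turn and 4 tool calls per turn are made-up inputs, and only the ~500 tokens per call comes from the text.

```python
def turns_until_exhausted(window, fixed, inputs, history_per_turn,
                          tool_calls_per_turn, tool_call_cost=500):
    """Count the full turns that fit before the window is exceeded."""
    per_turn = history_per_turn + tool_calls_per_turn * tool_call_cost
    if per_turn <= 0:
        raise ValueError("per-turn cost must be positive")
    remaining = window - fixed - inputs
    return max(remaining // per_turn, 0)


# Fixed costs here are the system prompt plus AGENTS.md files (~5k);
# tool overhead is charged per turn instead of as a lump sum.
turns = turns_until_exhausted(
    window=200_000,
    fixed=5_000,
    inputs=70_000,            # files read up front
    history_per_turn=3_000,   # assumed growth per exchange
    tool_calls_per_turn=4,    # assumed; 4 * 500 = 2,000 tokens
)
print(turns)  # 25
```

Under these assumptions a session that starts with 70k tokens of inputs has room for about 25 exchanges, which is why long multi-turn agents need a smaller up-front read budget.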