CDX-201c · Module 1
Agent Configuration & Roles
3 min read
Codex supports named sub-agents configured in config.toml, each with its own model, description, and specialization. The description field is not cosmetic — it tells Codex when to delegate to that agent. A well-described sub-agent gets invoked for matching tasks; a poorly described one sits idle. Think of sub-agent descriptions like SKILL.md descriptions: they determine activation.
Model selection per agent is the primary lever for cost and quality optimization. Use fast, economical models (GPT-4.1) for high-volume, routine tasks like test writing, linting fixes, and documentation updates. Use reasoning models (o3) for architectural decisions, complex debugging, and code review. Use Codex-series models for balanced engineering tasks. This heterogeneous model strategy lets you optimize the cost-quality tradeoff per agent rather than applying one model to everything.
```toml
# Sub-agent definitions: each optimized for its role
[agents.implementer]
model = "codex-1"
description = """Implements features and fixes based on specs. \
Use for: coding, refactoring, building features."""

[agents.test-writer]
model = "gpt-4.1"
description = """Writes comprehensive test suites with edge cases. \
Use for: unit tests, integration tests, test fixtures."""

[agents.reviewer]
model = "o3"
reasoning_effort = "high"
description = """Reviews code for correctness, security, and style. \
Use for: code review, architecture review, security audit."""

[agents.documenter]
model = "gpt-4.1"
description = """Writes and updates documentation. \
Use for: API docs, README, inline comments, changelog."""
```
Do This
- Assign the right model to each agent role — fast models for routine, reasoning models for analysis
- Write specific descriptions with trigger phrases that match natural prompts
- Keep agent count low — 3-4 well-defined agents beat 10 vaguely scoped ones
- Test each agent independently before combining them in orchestration
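The "trigger phrases" point can be illustrated with a toy router: a crude keyword overlap between the prompt and each description stands in for whatever matching Codex actually performs. The scoring below is an assumption for illustration only, not Codex's real delegation logic, and the agent names and descriptions are abbreviated from the config above:

```python
import re

# Abbreviated descriptions from the config above (illustrative only)
AGENTS = {
    "test-writer": "Writes comprehensive test suites. Use for: unit tests, integration tests, test fixtures.",
    "reviewer": "Reviews code for correctness, security, and style. Use for: code review, security audit.",
    "documenter": "Writes and updates documentation. Use for: API docs, README, changelog.",
}

def tokens(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_agent(prompt: str) -> str:
    # Score each agent by keyword overlap with the prompt; highest wins.
    prompt_words = tokens(prompt)
    return max(AGENTS, key=lambda name: len(prompt_words & tokens(AGENTS[name])))

print(pick_agent("write unit tests for the parser"))        # test-writer
print(pick_agent("review this diff for security issues"))   # reviewer
```

Even this naive matcher shows why a description like "helps with development tasks" fails: with no concrete trigger phrases, it never outscores a specific one.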
Avoid This
- Giving every agent the most expensive model "just in case"
- Writing vague descriptions like "helps with development tasks"
- Creating overlapping agents: if two agents could handle the same task, consolidate
- Skipping individual-agent testing: debugging a multi-agent failure is much harder than a single-agent one