CDX-201c · Module 1
Agent Configuration & Roles
3 min read
Codex supports named sub-agents configured in config.toml, each with its own model, description, and specialization. The description field is not cosmetic — it tells Codex when to delegate to that agent. A well-described sub-agent gets invoked for matching tasks; a poorly described one sits idle. Think of sub-agent descriptions like SKILL.md descriptions: they determine activation.
Model selection per agent is the primary lever for cost and quality optimization. Use fast, economical models (GPT-4.1) for high-volume, routine tasks like test writing, linting fixes, and documentation updates. Use reasoning models (o3) for architectural decisions, complex debugging, and code review. Use Codex-series models for balanced engineering tasks. This heterogeneous model strategy lets you optimize the cost-quality tradeoff per agent rather than applying one model to everything.
```toml
# Sub-agent definitions: each optimized for its role
[agents.implementer]
model = "codex-1"
description = """Implements features and fixes based on specs. \
Use for: coding, refactoring, building features."""

[agents.test-writer]
model = "gpt-4.1"
description = """Writes comprehensive test suites with edge cases. \
Use for: unit tests, integration tests, test fixtures."""

[agents.reviewer]
model = "o3"
reasoning_effort = "high"
description = """Reviews code for correctness, security, and style. \
Use for: code review, architecture review, security audit."""

[agents.documenter]
model = "gpt-4.1"
description = """Writes and updates documentation. \
Use for: API docs, README, inline comments, changelog."""
```
Do This
- Assign the right model to each agent role — fast models for routine, reasoning models for analysis
- Write specific descriptions with trigger phrases that match natural prompts
- Keep agent count low — 3-4 well-defined agents beat 10 vaguely scoped ones
- Test each agent independently before combining them in orchestration
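The "trigger phrases" point can be illustrated with a toy router: a crude keyword overlap between the prompt and each description stands in for whatever matching Codex actually performs. The scoring below is an assumption for illustration only, not Codex's real delegation logic, and the agent names and descriptions are abbreviated from the config above:

```python
import re

# Abbreviated descriptions from the config above (illustrative only)
AGENTS = {
    "test-writer": "Writes comprehensive test suites. Use for: unit tests, integration tests, test fixtures.",
    "reviewer": "Reviews code for correctness, security, and style. Use for: code review, security audit.",
    "documenter": "Writes and updates documentation. Use for: API docs, README, changelog.",
}

def tokens(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_agent(prompt: str) -> str:
    # Score each agent by keyword overlap with the prompt; highest wins.
    prompt_words = tokens(prompt)
    return max(AGENTS, key=lambda name: len(prompt_words & tokens(AGENTS[name])))

print(pick_agent("write unit tests for the parser"))        # test-writer
print(pick_agent("review this diff for security issues"))   # reviewer
```

Even this naive matcher shows why a description like "helps with development tasks" fails: with no concrete trigger phrases, it never outscores a specific one.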
Avoid This
- Giving every agent the most expensive model "just in case"
- Writing vague descriptions like "helps with development tasks"
- Creating overlapping agents: if two agents could handle the same task, consolidate
- Skipping individual-agent testing: debugging a multi-agent failure is much harder than a single-agent one