CDX-201c · Module 1

Agent Configuration & Roles

3 min read

Codex supports named sub-agents configured in config.toml, each with its own model, description, and specialization. The description field is not cosmetic — it tells Codex when to delegate to that agent. A well-described sub-agent gets invoked for matching tasks; a poorly described one sits idle. Think of sub-agent descriptions like SKILL.md descriptions: they determine activation.
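To make the activation point concrete, here is a minimal sketch contrasting a vague description with a specific one. It reuses the `[agents.*]` table shape from the config below; the agent names and wording are illustrative, not from a real project:

```toml
# Vague — gives Codex almost no signal for when to delegate:
[agents.helper]
model = "codex-1"
description = "Helps with development tasks."

# Specific — trigger phrases match how users actually ask:
[agents.migration-runner]
model = "codex-1"
description = "Writes and runs database schema migrations. Use for: adding columns, altering tables, migration rollbacks."
```

A prompt like "add a rollback for the users table migration" matches the second description almost word for word; nothing in a user's prompt will ever match "helps with development tasks."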

Model selection per agent is the primary lever for cost and quality optimization. Use fast, economical models (GPT-4.1) for high-volume, routine tasks like test writing, linting fixes, and documentation updates. Use reasoning models (o3) for architectural decisions, complex debugging, and code review. Use Codex-series models for balanced engineering tasks. This heterogeneous model strategy lets you optimize the cost-quality tradeoff per agent rather than applying one model to everything.

# Sub-agent definitions — each optimized for its role

[agents.implementer]
model = "codex-1"
description = """Implements features and fixes based on specs. \
  Use for: coding, refactoring, building features."""

[agents.test-writer]
model = "gpt-4.1"
description = """Writes comprehensive test suites with edge cases. \
  Use for: unit tests, integration tests, test fixtures."""

[agents.reviewer]
model = "o3"
reasoning_effort = "high"
description = """Reviews code for correctness, security, and style. \
  Use for: code review, architecture review, security audit."""

[agents.documenter]
model = "gpt-4.1"
description = """Writes and updates documentation. \
  Use for: API docs, README, inline comments, changelog."""

Do This

  • Assign the right model to each agent role — fast models for routine, reasoning models for analysis
  • Write specific descriptions with trigger phrases that match natural prompts
  • Keep agent count low — 3-4 well-defined agents beat 10 vaguely scoped ones
  • Test each agent independently before combining them in orchestration
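One way to exercise agents independently is to send prompts that should match exactly one agent's "Use for:" triggers, then check the session output to confirm which sub-agent Codex delegated to. A sketch — it assumes the `codex` CLI accepts an initial prompt argument, and the file paths are made up:

```shell
# Each prompt is worded to hit exactly one agent's trigger phrases.
codex "Write unit tests for src/parser.py"          # expect delegation to test-writer
codex "Review this diff for security issues"        # expect delegation to reviewer
codex "Update the README for the new config keys"   # expect delegation to documenter
```

If a prompt routes to the wrong agent (or to none), tighten that agent's description before adding more agents on top.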

Avoid This

  • Give every agent the most expensive model "just in case"
  • Write vague descriptions like "helps with development tasks"
  • Create overlapping agents — if two agents could handle the same task, consolidate
  • Skip testing individual agents — debugging a multi-agent failure is much harder than a single-agent one
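As an example of the overlap problem, consider two hypothetical agents whose scopes both cover documentation. Either could plausibly handle "update the docs," so routing becomes unpredictable; merging them into the single documenter agent from the config above removes the ambiguity:

```toml
# Before — overlapping scopes, ambiguous routing:
[agents.api-docs]
model = "gpt-4.1"
description = "Writes API documentation. Use for: API docs, docstrings."

[agents.readme-writer]
model = "gpt-4.1"
description = "Maintains the README. Use for: README, docs updates."

# After — one agent owns all documentation work:
[agents.documenter]
model = "gpt-4.1"
description = "Writes and updates documentation. Use for: API docs, README, inline comments, changelog."
```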