CDX-301h · Module 2

Escalation Paths

3 min read

Escalation is what happens when retry fails. A well-designed supervisor has a clear escalation chain: worker fails → supervisor retries with enriched context → supervisor escalates to a more capable model → supervisor escalates to a human reviewer. Each level of escalation is more expensive and slower, but handles a wider range of failures. The goal is to resolve as many failures as possible at the cheapest level and only escalate the genuinely hard cases.

Model escalation is the most common automated escalation. A worker running gpt-4.1 fails a task; the supervisor retries with codex-1; if that fails, the supervisor retries with o3 at high reasoning effort. Each step up costs more per token but brings stronger reasoning capability. The supervisor should include the previous failure reason in the escalated prompt so the more capable model can avoid the same mistake.

Human escalation is the final backstop. When automated escalation is exhausted, the pipeline pauses and notifies a human reviewer with a structured summary: what the task was, what approaches were tried, why each failed, and what the supervisor recommends. The human either provides a fix, adjusts the task scope, or aborts the pipeline. The key design principle is that human escalation should be rare (less than 5% of tasks) but well-supported when it occurs.

Do This

Define the full escalation chain upfront: retry → model escalation → human review
Include prior failure context at each escalation level — the next handler should not repeat mistakes
Set a maximum pipeline pause duration for human escalation — auto-abort if no response
Track escalation frequency per task type to identify systemic decomposition issues

Avoid This

Let the supervisor improvise escalation at runtime — it wastes tokens on meta-reasoning
Escalate directly to human without trying automated recovery — humans should handle only the hard cases
Pause the pipeline indefinitely waiting for human input — set timeouts and auto-abort policies
Treat escalation as failure — it is a designed recovery path, not a bug