CDX-301i · Module 2
Tracing Across Agents
When a multi-agent pipeline fails, the first question is always "which agent caused the failure and why?" Without distributed tracing, answering this question requires manually correlating logs across agents — a process that scales poorly as pipeline complexity grows. Distributed tracing assigns a unique trace ID to each pipeline execution and propagates it through every agent invocation, hand-off, and quality gate. The resulting trace is a complete timeline of the pipeline, viewable in a single dashboard.
The OpenAI Agents SDK provides built-in tracing through its trace and span system. Each agent run creates a span; hand-offs create child spans linked to the parent. Custom spans can be added for quality gates, file operations, and external API calls. By default, traces are sent to the OpenAI Traces dashboard. The SDK also lets you register custom trace processors, which can forward traces to standard observability platforms (Datadog, Jaeger, Grafana Tempo) for visualization and alerting, typically through an OpenTelemetry bridge or a vendor integration.
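import subprocess

from agents import Agent, Runner, RunConfig, custom_span

# `supervisor` is assumed to be an Agent defined elsewhere in the pipeline.

# Automatic tracing: the SDK traces every agent run.
# Runner.run is a coroutine; use run_sync (or `await Runner.run(...)`).
result = Runner.run_sync(
    supervisor,
    "Implement the rate limiting feature",
    # The trace ID is auto-generated. A custom trace_id must follow the
    # SDK's documented trace_<32_alphanumeric> format, so a free-form
    # pipeline identifier is passed here as group_id instead.
    run_config=RunConfig(group_id="pipeline-2026-04-21-001"),
)

# Custom spans for non-agent operations
def run_quality_gate(workdir: str) -> dict:
    """Run the test suite inside a custom span so it appears in the trace."""
    # custom_span returns a context manager, not a decorator. Call this
    # function while a trace is active so the span is recorded.
    with custom_span("quality-gate"):
        result = subprocess.run(
            ["npm", "test"], capture_output=True,
            text=True, cwd=workdir
        )
    return {
        "passed": result.returncode == 0,
        "output": result.stdout[-500:],  # Last 500 chars
    }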
# Trace visualization
# Pipeline trace:
# ├── supervisor (o3, 12.4s)
# │ ├── implementer (codex-1, 45.2s)
# │ │ └── quality-gate (2.1s) ✓
# │ ├── tester (gpt-4.1, 23.8s)
# │ │ └── quality-gate (1.8s) ✓
# │ └── merge (3.2s) ✓
# └── total: 88.5s, $0.47
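The tree above can be rebuilt from flat span records. The sketch below assumes each record carries an ID, a parent ID, a name, and a duration. The field names are hypothetical and do not reflect the SDK's export schema. It also assumes each duration excludes time spent in child spans, which is an inference from the fact that the listed figures sum to the 88.5s total.

```python
from collections import defaultdict

# Hypothetical flat span records mirroring the example trace above.
SPANS = [
    {"id": "s1", "parent": None, "name": "supervisor", "seconds": 12.4},
    {"id": "s2", "parent": "s1", "name": "implementer", "seconds": 45.2},
    {"id": "s3", "parent": "s2", "name": "quality-gate", "seconds": 2.1},
    {"id": "s4", "parent": "s1", "name": "tester", "seconds": 23.8},
    {"id": "s5", "parent": "s4", "name": "quality-gate", "seconds": 1.8},
    {"id": "s6", "parent": "s1", "name": "merge", "seconds": 3.2},
]


def render_tree(spans):
    """Return indented lines, one per span, with children under their parent."""
    children = defaultdict(list)
    for s in spans:
        children[s["parent"]].append(s)

    lines = []

    def walk(parent_id, depth):
        for s in children[parent_id]:
            lines.append(f"{'  ' * depth}{s['name']} ({s['seconds']}s)")
            walk(s["id"], depth + 1)

    walk(None, 0)  # Root spans have no parent.
    return lines


def total_seconds(spans):
    """Sum per-span durations, assuming each excludes its children's time."""
    return round(sum(s["seconds"] for s in spans), 1)


if __name__ == "__main__":
    print("\n".join(render_tree(SPANS)))
    print(f"total: {total_seconds(SPANS)}s")
```

If your backend reports inclusive durations instead, summing every span would double-count nested work, and the total should come from the root span alone.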
- Enable SDK tracing: The Agents SDK traces automatically. Ensure trace output is directed to your observability platform, for example through a trace processor that exports to an OpenTelemetry collector.
- Add custom spans: Wrap quality gates, file operations, and external calls in custom spans. These non-agent operations are invisible without custom instrumentation.
- Build trace-based alerts: Alert on traces where any span takes more than twice its expected duration, or where total pipeline cost exceeds the budget threshold.
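The alerting rule in the last item can be sketched as a plain function over finished span records. The baselines, the budget figure, and the record fields below are all hypothetical placeholders. In practice the thresholds would come from your own historical traces, and the check would usually live in your observability platform's alert rules rather than in application code.

```python
# Hypothetical per-span duration baselines in seconds; tune from real history.
EXPECTED_SECONDS = {
    "supervisor": 15.0,
    "implementer": 40.0,
    "tester": 30.0,
    "quality-gate": 3.0,
    "merge": 5.0,
}
SLOW_FACTOR = 2.0       # Alert when a span takes more than 2x its baseline.
COST_BUDGET_USD = 0.50  # Hypothetical per-pipeline budget.


def find_alerts(spans, total_cost_usd):
    """Return human-readable alert strings for one finished trace."""
    alerts = []
    for s in spans:
        expected = EXPECTED_SECONDS.get(s["name"])
        if expected is None:
            continue  # No baseline yet, so nothing to compare against.
        if s["seconds"] > SLOW_FACTOR * expected:
            alerts.append(
                f"slow span: {s['name']} took {s['seconds']}s "
                f"(expected about {expected}s)"
            )
    if total_cost_usd > COST_BUDGET_USD:
        alerts.append(
            f"over budget: ${total_cost_usd:.2f} > ${COST_BUDGET_USD:.2f}"
        )
    return alerts
```

For example, `find_alerts([{"name": "quality-gate", "seconds": 6.5}], 0.75)` returns one slow-span alert and one budget alert, while a trace within both limits returns an empty list.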