AT-301h · Module 1

Root Cause vs. Symptom

In multi-agent systems, the symptom and the root cause are frequently separated by two or three agents in the chain. Fixing the symptom (patching the agent that produced the bad output) is a band-aid: it does not prevent recurrence. The failure will resurface, often in a different form, because the upstream issue persists.

The Five Whys, adapted for multi-agent systems:

1. Why did RENDER produce a misaligned layout? Because the content brief from QUILL had incorrect section counts.
2. Why did QUILL produce an incorrect brief? Because SCOPE's intelligence digest used an outdated competitive dataset.
3. Why was SCOPE's dataset outdated? Because the data refresh pipeline silently failed 3 days ago.
4. Why did the pipeline fail silently? Because the failure notification was routed to a decommissioned channel.
5. Why was the notification routed to a decommissioned channel? Because the notification routing configuration was not updated during the last team restructure.

Root cause: stale notification routing configuration.
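
The chain above can be walked mechanically once each finding records which upstream component caused it. The following is a minimal sketch, not a prescribed tool: the `Finding` record, the `FINDINGS` table, and the component names `data-refresh-pipeline` and `notification-routing` are hypothetical labels invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    component: str            # agent or pipeline stage under examination
    problem: str              # what went wrong at this stage
    caused_by: Optional[str]  # upstream component responsible, or None at the root


# Hypothetical incident record mirroring the chain in the text.
FINDINGS = {
    "RENDER": Finding("RENDER", "misaligned layout", "QUILL"),
    "QUILL": Finding("QUILL", "incorrect section counts in brief", "SCOPE"),
    "SCOPE": Finding("SCOPE", "outdated competitive dataset", "data-refresh-pipeline"),
    "data-refresh-pipeline": Finding(
        "data-refresh-pipeline", "silent failure", "notification-routing"
    ),
    "notification-routing": Finding(
        "notification-routing", "alerts sent to a decommissioned channel", None
    ),
}


def trace_to_root(findings, symptom):
    """Follow caused_by links from the symptom until no upstream cause remains."""
    chain = []
    seen = set()
    current = symptom
    while current is not None:
        if current in seen:
            # Guard against a mis-recorded loop in the incident data.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        finding = findings[current]
        chain.append(finding)
        current = finding.caused_by
    return chain


chain = trace_to_root(FINDINGS, "RENDER")
root = chain[-1]
hops = len(chain) - 1  # number of links between symptom and root cause
print(f"symptom: {chain[0].component}, root cause: {root.component}, hops: {hops}")
# prints: symptom: RENDER, root cause: notification-routing, hops: 4
```

Each "why" corresponds to one `caused_by` link, so the hop count falls out of the length of the chain.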

The root cause is 4 hops away from the symptom. Fixing RENDER's layout engine would produce zero lasting improvement. Fixing the notification routing resolves this failure and prevents an entire category of future silent failures.
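
One way to guard against this category of failure is to check the routing configuration against the set of channels that still exist. The sketch below assumes routing is available as a simple mapping; the alert names, channel names, and the `find_dead_routes` helper are all hypothetical, and a real system would load both sides from its own configuration and chat platform.

```python
# Hypothetical routing table: alert type -> destination channel.
ROUTES = {
    "data-refresh-failure": "#data-ops-legacy",
    "render-error": "#design-agents",
}

# Hypothetical set of channels that currently exist and are monitored.
ACTIVE_CHANNELS = {"#design-agents", "#platform-oncall"}


def find_dead_routes(routes, active_channels):
    """Return alert types whose destination is not an active channel."""
    return sorted(
        alert for alert, channel in routes.items()
        if channel not in active_channels
    )


dead = find_dead_routes(ROUTES, ACTIVE_CHANNELS)
print(dead)  # ['data-refresh-failure']
```

Running a check like this after any team restructure would surface stale routes before a pipeline fails silently.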