AT-301c · Module 3
Quality Regression Detection
Quality does not only improve. It regresses — silently, gradually, and usually because something upstream changed without the downstream quality gate being recalibrated. A prompt update that improves speed might degrade quality. A new agent joining the team changes handoff patterns. A role expansion introduces tasks the quality gate was never designed to evaluate.
Regression detection requires a baseline and a monitoring cadence. To establish the baseline, run 10 representative artifacts through the full quality loop and record the scores. That is your benchmark. Each week, run 3 new artifacts through the same loop and compare the results to the baseline. A drop of 0.30 points or more below the baseline, sustained across two consecutive weeks, is a regression signal. Do not wait for customer complaints to discover quality regression: by then, the damage is measured in lost trust, not lost points.
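The rule above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `is_regression`, the use of a simple mean over each batch, and the example scores are all assumptions made for the sketch. Only the 0.30-point threshold and the two-week window come from the text.

```python
from statistics import mean

DROP_THRESHOLD = 0.30   # points below baseline, from the text
CONSECUTIVE_WEEKS = 2   # how long the drop must persist, from the text


def is_regression(baseline_scores, weekly_batches,
                  threshold=DROP_THRESHOLD, weeks=CONSECUTIVE_WEEKS):
    """Return True if the most recent `weeks` batches each average
    at least `threshold` points below the baseline mean.

    baseline_scores: scores from the baseline run (e.g. 10 artifacts)
    weekly_batches:  list of weekly score lists, oldest first
    """
    if len(weekly_batches) < weeks:
        return False  # not enough history for a sustained signal
    baseline = mean(baseline_scores)
    recent = weekly_batches[-weeks:]
    return all(baseline - mean(batch) >= threshold for batch in recent)


# Hypothetical scores, for illustration only
baseline = [4.0] * 10
history = [[3.5, 3.5, 3.5], [3.6, 3.5, 3.4]]
print(is_regression(baseline, history))
```

A single bad week does not fire the signal here. That matches the "sustained" wording: with only 3 artifacts per week, one low batch may be noise.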
- Establish Baseline Scores: Run 10 representative artifacts through the full quality loop. Record scores per dimension. Calculate the mean and standard deviation for each dimension. This is your quality baseline, the number you are defending.
- Monitor Weekly: Run 3 new artifacts through the same loop every week. Compare dimension scores to the baseline. Flag any dimension whose weekly average drops more than one standard deviation below its baseline mean.
- Investigate and Remediate: When a regression signal fires, identify what changed upstream (prompt updates, role changes, new agents, modified handoffs). Isolate the cause. Fix it at the source. Re-run the baseline to confirm recovery. Document the incident for future pattern matching.
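The first two steps above can be sketched per dimension as follows. This is an illustrative sketch under stated assumptions: the function names `build_baseline` and `flag_dimensions`, the dimension names `accuracy` and `clarity`, and the scores are hypothetical. The sample standard deviation is used because the text does not say whether sample or population is meant.

```python
from statistics import mean, stdev


def build_baseline(artifacts):
    """Compute (mean, sample standard deviation) per dimension.

    artifacts: list of dicts mapping dimension name -> score,
               one dict per baseline artifact (e.g. 10 of them).
    """
    dimensions = list(artifacts[0].keys())
    baseline = {}
    for dim in dimensions:
        scores = [artifact[dim] for artifact in artifacts]
        baseline[dim] = (mean(scores), stdev(scores))
    return baseline


def flag_dimensions(baseline, weekly_artifacts):
    """Return the dimensions whose weekly average falls more than
    one standard deviation below the baseline mean."""
    flagged = []
    for dim, (base_mean, base_std) in baseline.items():
        weekly_mean = mean(artifact[dim] for artifact in weekly_artifacts)
        if weekly_mean < base_mean - base_std:
            flagged.append(dim)
    return flagged


# Hypothetical baseline run: 10 artifacts scored on two dimensions
values = [4.0, 4.2, 3.8, 4.0, 4.2, 3.8, 4.0, 4.2, 3.8, 4.0]
baseline = build_baseline([{"accuracy": v, "clarity": v} for v in values])

# Hypothetical weekly run: 3 new artifacts
week = [
    {"accuracy": 3.5, "clarity": 4.0},
    {"accuracy": 3.6, "clarity": 3.9},
    {"accuracy": 3.4, "clarity": 4.1},
]
print(flag_dimensions(baseline, week))
```

Flagged dimensions are a starting point for the investigation step: they say where quality moved, not what upstream change caused it.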