AT-301c · Module 3

Quality Regression Detection

3 min read

Quality does not only improve. It regresses — silently, gradually, and usually because something upstream changed without the downstream quality gate being recalibrated. A prompt update that improves speed might degrade quality. A new agent joining the team changes handoff patterns. A role expansion introduces tasks the quality gate was never designed to evaluate.

Regression detection requires a baseline and a monitoring cadence. Establish the baseline: run 10 representative artifacts through the full quality loop and record the scores. That is your benchmark. Weekly, run 3 new artifacts through the same loop and compare. A sustained drop of 0.30 points or more across two consecutive weeks is a regression signal. Do not wait for customer complaints to discover quality regression — by then, the damage is measured in lost trust, not lost points.

Establish Baseline Scores Run 10 representative artifacts through the full quality loop. Record scores per dimension. Calculate the mean and standard deviation. This is your quality baseline — the number you are defending.
Monitor Weekly Run 3 new artifacts through the same loop every week. Compare dimension scores to baseline. Flag any dimension that drops more than one standard deviation below the mean.
Investigate and Remediate When a regression signal fires: identify what changed upstream (prompt updates, role changes, new agents, modified handoffs). Isolate the cause. Fix at the source. Re-run the baseline to confirm recovery. Document the incident for future pattern matching.