OC-301g · Module 2

Output Quality Drift

3 min read

Quality drift is the gradual degradation of output quality over time. It is invisible on any given day — today's output is 0.1% worse than yesterday's. But over three months, the cumulative drift is 9%, and stakeholders notice that the agent's deliverables are not as good as they used to be. By then, identifying the root cause requires months of archeology through logs and quality scores.

Drift detection uses trend analysis on quality scores. Plot the automated quality score for each output dimension over time. Apply a moving average to smooth daily variance. If the moving average shows a sustained decline over two or more weeks, flag it as drift. The drift alert is: "Factual accuracy score has declined from a 30-day average of 4.2 to 3.8 over the last 14 days." This early warning catches drift while the cause is recent enough to identify — a prompt change two weeks ago, a memory contamination event, or a model update that shifted the agent's behavior.

Do This

Track quality scores as time series and monitor for sustained declines — daily scores vary, trends reveal drift
Alert on two-week declining trends, not single low scores — single scores fluctuate, trends indicate systemic change
Correlate drift onset with system changes — the cause is usually a change made 1-3 weeks before drift appeared

Avoid This

Check quality only when stakeholders complain — by then drift has been compounding for weeks or months
Average quality scores over long periods — averaging hides the trend in the data
Dismiss small declines as noise — 0.1% per day is 36% per year