RC-401h · Module 4

The Prompt Audit: What to Measure, What to Retire

A prompt system that is never audited accumulates technical debt the same way any software system does: slowly, then catastrophically. Prompts written for a use case that no longer exists keep consuming model tokens. Prompts that have not been tested since their initial deployment may have drifted from acceptable behavior without detection. Prompts with poorly defined scope boundaries expand over time through informal edits. The prompt audit is the scheduled, systematic review that keeps the library governable.

Audits happen on two triggers. The first is the calendar trigger: quarterly, every registered prompt is reviewed against its acceptance criteria, its regression suite is run against the current production model, and the results are recorded. The second is the event trigger: a model update, a significant change to a downstream consumer, or a production incident involving any prompt in the fleet triggers an immediate scope-specific audit of the affected prompts and their dependencies.
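
The two triggers can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `DOWNSTREAM` map, the prompt names, and the 91-day interval are hypothetical, and "dependencies" is read here as the prompts that consume an affected prompt's output.

```python
from collections import deque
from datetime import date, timedelta

# Hypothetical map: prompt id -> prompts that consume its output.
DOWNSTREAM = {
    "extract_invoice": ["summarize_invoice"],
    "summarize_invoice": ["route_ticket"],
    "route_ticket": [],
    "greet_user": [],
}


def calendar_audit_due(last_audit, today, interval_days=91):
    """Calendar trigger: True once roughly a quarter has passed."""
    return today - last_audit >= timedelta(days=interval_days)


def event_audit_scope(affected, downstream):
    """Event trigger: the affected prompts plus everything downstream of them."""
    scope = set(affected)
    queue = deque(affected)
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, []):
            if child not in scope:
                scope.add(child)
                queue.append(child)
    return sorted(scope)


print(calendar_audit_due(date(2025, 1, 6), date(2025, 4, 7)))
print(event_audit_scope(["extract_invoice"], DOWNSTREAM))
```

The breadth-first walk keeps an event audit scoped: `greet_user` is untouched by an incident in `extract_invoice`, so it waits for the next calendar audit.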

Measure: Schema Valid Rate and Judge Score Trend

Pull the telemetry window for the audit period. For each prompt, graph the schema valid rate and judge score over time. A flat line at or above threshold means the prompt is stable. A declining trend that has not yet triggered an alert is an early warning: address it before it breaches the threshold. A prompt that has stayed below threshold without triggering action means the alerting configuration was wrong. Fix both the prompt and the alerting.
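
One way to turn the graph into a repeatable check is to fit a straight line to the metric and look at its slope. The sketch below assumes evenly spaced samples (for example, weekly schema valid rates); the function names, the labels, and the `decline_slope` cutoff are illustrative choices, not values from this module.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def classify_trend(values, threshold, decline_slope=-0.002):
    """Label a metric series as breached, early_warning, or stable.

    decline_slope is an assumed cutoff: a drop steeper than this per
    sample counts as a declining trend.
    """
    if values[-1] < threshold:
        return "breached"
    if slope(values) < decline_slope:
        return "early_warning"
    return "stable"


print(classify_trend([0.99, 0.98, 0.97, 0.96], threshold=0.95))
```

Here the latest value is still above the 0.95 threshold, but the series loses about a point per sample, so it is labeled an early warning rather than stable. The same function can be applied to judge scores.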

Measure: Token Budget Utilization

Compare the registered token budget for each prompt against the actual average token counts in the telemetry window. A prompt consistently using only 40% of its budget may have room to tighten instructions. A prompt consistently exceeding its budget is either underbudgeted or has accumulated scope beyond its original design. Flag over-budget prompts for scope review.
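
The comparison reduces to a ratio of average tokens used to tokens budgeted. A rough sketch follows, where the function name, the labels, and the use of 40% as the low-utilization cutoff are assumptions made for illustration:

```python
def budget_status(budget_tokens, avg_tokens, low_utilization=0.4):
    """Return (utilization, label) for one prompt.

    low_utilization is an assumed cutoff for "well under budget".
    """
    if budget_tokens <= 0:
        raise ValueError("registered budget must be positive")
    utilization = avg_tokens / budget_tokens
    if utilization > 1.0:
        label = "over_budget"  # audit flag: scope review
    elif utilization <= low_utilization:
        label = "under_used"   # candidate for tightening
    else:
        label = "ok"
    return utilization, label


for budget, avg in [(1000, 350), (1000, 800), (1000, 1200)]:
    print(budget, avg, budget_status(budget, avg))
```

Returning the raw utilization next to the label lets the audit record keep the number, so the next audit can compare like with like.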

Retire: Identify Zombie Prompts

A zombie prompt is a registered library entry with zero or near-zero invocations in the audit period. Zombie prompts are candidates for retirement. Before retiring one, verify with the owning agent or team that the use case genuinely no longer exists. If the use case is dormant but not dead (a seasonal workflow, a client-specific variant), mark it as dormant with a review date. If the use case is gone, formally deprecate and archive the entry. Zombie prompts that stay registered obscure the real library and create maintenance overhead for audits.
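
The retirement decision can be written as a small decision function. Everything named here is hypothetical: the `PromptUsage` record, the `near_zero` cutoff of 5 invocations, and the 90-day review interval stand in for whatever the library registry actually stores.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class PromptUsage:
    prompt_id: str
    invocations: int                   # count within the audit period
    owner_says_dead: Optional[bool]    # None means the owner has not been asked


def zombie_action(entry, today, near_zero=5, review_in_days=90):
    """Return (action, review_date); review_date is set only for dormant entries."""
    if entry.invocations > near_zero:
        return "active", None
    if entry.owner_says_dead is None:
        return "verify_with_owner", None
    if entry.owner_says_dead:
        return "deprecate_and_archive", None
    return "mark_dormant", today + timedelta(days=review_in_days)


today = date(2025, 1, 1)
print(zombie_action(PromptUsage("seasonal_report", 0, False), today))
```

The `verify_with_owner` state keeps the function from retiring anything on invocation counts alone, which matches the rule that the owner confirms before an entry is archived.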

Retire: Flag Prompts Without Regression Suites

Any prompt registered in the library without a corresponding regression test file is a liability. Flag it for remediation with a 30-day deadline: either a regression suite is built and the baseline scores are established, or the prompt is deprecated. There are no ungoverned exceptions in a production-grade prompt library. "We never got around to writing tests for that one" is the sentence that precedes the incident postmortem.
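
A check like this is easy to automate once the audit has a list of registered prompts and a list of prompts that have a regression suite. The sketch below assumes both are available as plain collections of prompt ids; the function and prompt names are illustrative, and how the suites are discovered (file naming, registry field) is left open.

```python
from datetime import date, timedelta


def flag_untested(registered, with_suite, today, grace_days=30):
    """Map each prompt lacking a regression suite to its remediation deadline."""
    deadline = today + timedelta(days=grace_days)
    covered = set(with_suite)
    return {pid: deadline for pid in sorted(registered) if pid not in covered}


def overdue(flags, today):
    """Prompts whose deadline has passed: deprecate unless a suite now exists."""
    return sorted(pid for pid, deadline in flags.items() if today > deadline)


flags = flag_untested(
    registered=["extract_invoice", "route_ticket", "greet_user"],
    with_suite=["extract_invoice"],
    today=date(2025, 1, 1),
)
print(flags)
print(overdue(flags, today=date(2025, 2, 15)))
```

Recording the deadline at flag time, rather than recomputing it at each audit, keeps the 30-day clock from silently resetting.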