CC-301a · Module 3

Compound Engineering Measurement

Compound engineering is the principle that every improvement to your CLAUDE.md makes every future session better. A rule added in week one prevents errors in weeks two through fifty-two. A code snippet added in week three improves output quality in weeks four through fifty-two. The compound effect is real, but it is also invisible — you cannot directly observe the errors that did not happen because a rule prevented them.

Measuring compound engineering requires proxy metrics. The most accessible metric is first-attempt acceptance rate: what percentage of Claude's code output do you accept without modification on the first try? Track this weekly. A well-maintained CLAUDE.md should push this number up over time, for example from roughly 40% in week one to roughly 70% by month three. If the number plateaus or declines, your rules are likely stale, conflicting, or bloated.
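One way to track this is a hand-kept log with one entry per Claude output, noting whether you accepted it unchanged. The sketch below assumes such a log; the `weekly_acceptance_rate` helper and the sample entries are hypothetical, not part of any Claude Code tooling.

```python
from collections import defaultdict
from datetime import date


def weekly_acceptance_rate(log):
    """Return {(iso_year, iso_week): percent of outputs accepted unmodified}."""
    totals = defaultdict(lambda: [0, 0])  # week -> [accepted, total]
    for day, accepted in log:
        iso = day.isocalendar()
        key = (iso[0], iso[1])  # ISO year and ISO week number
        totals[key][1] += 1
        if accepted:
            totals[key][0] += 1
    return {week: 100.0 * a / n for week, (a, n) in sorted(totals.items())}


# Hypothetical log: (date, accepted_without_modification)
log = [
    (date(2025, 1, 6), False),
    (date(2025, 1, 7), True),
    (date(2025, 1, 8), False),
    (date(2025, 1, 9), False),
    (date(2025, 1, 10), True),
    (date(2025, 1, 13), True),
    (date(2025, 1, 14), True),
    (date(2025, 1, 15), False),
    (date(2025, 1, 16), True),
]

rates = weekly_acceptance_rate(log)
for (year, week), pct in rates.items():
    print(f"{year}-W{week:02d}: {pct:.0f}% accepted on first attempt")
```

Grouping by ISO week keeps the buckets stable across month boundaries. With only a handful of outputs per week the percentage will be noisy, so read it as a trend over several weeks rather than week by week.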

The second metric is intervention frequency: how often do you need to interrupt Claude, correct its approach, or regenerate output? This is harder to track formally, but even a rough weekly estimate is valuable. The trendline matters more than the absolute number. If you are intervening less often in month three than in month one, your CLAUDE.md is compounding.
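Because the trendline matters more than any single week, a rough slope over your weekly estimates is enough. This is a minimal sketch using an ordinary least-squares slope; the `trend_slope` function and the weekly counts are illustrative assumptions.

```python
def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two weekly estimates")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


# Hypothetical rough estimates of interventions per week, oldest first
weekly_interventions = [14, 12, 13, 10, 9, 9, 7, 6]

slope = trend_slope(weekly_interventions)
direction = "declining" if slope < 0 else "flat or rising"
print(f"Interventions change by {slope:+.2f} per week ({direction})")
```

A negative slope means interventions are declining, which is the signal that the CLAUDE.md is compounding. The fit tolerates noisy individual estimates, which suits a metric that is tracked informally.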

The third metric, and the one that ultimately matters to the business, is time-to-merge for AI-assisted PRs. Measure the elapsed time from the first Claude Code prompt to the PR being merged. This captures everything: rule effectiveness, code quality, test coverage, review friction. A well-maintained CLAUDE.md reduces time-to-merge because Claude produces code that passes tests, follows conventions, and requires fewer review cycles. Track the monthly average and watch for a downward trend.
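The monthly average can be computed from two timestamps per PR. The sketch below assumes you record them yourself as `(first_prompt_at, merged_at)` pairs; the `monthly_mean_hours` helper and the sample data are hypothetical, and PRs are bucketed by the month in which they merged.

```python
from collections import defaultdict
from datetime import datetime


def monthly_mean_hours(prs):
    """Return {"YYYY-MM": mean hours from first prompt to merge}."""
    buckets = defaultdict(list)
    for first_prompt_at, merged_at in prs:
        hours = (merged_at - first_prompt_at).total_seconds() / 3600
        buckets[merged_at.strftime("%Y-%m")].append(hours)
    return {month: sum(h) / len(h) for month, h in sorted(buckets.items())}


# Hypothetical records: (first Claude Code prompt, PR merged)
prs = [
    (datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 7, 15, 0)),    # 30 hours
    (datetime(2025, 1, 20, 10, 0), datetime(2025, 1, 21, 4, 0)),  # 18 hours
    (datetime(2025, 2, 3, 9, 0), datetime(2025, 2, 3, 21, 0)),    # 12 hours
    (datetime(2025, 2, 17, 13, 0), datetime(2025, 2, 18, 5, 0)),  # 16 hours
]

averages = monthly_mean_hours(prs)
for month, hours in averages.items():
    print(f"{month}: {hours:.1f} h average time-to-merge")
```

A mean is sensitive to a single long-running PR; if your data has outliers, a median per month may show the trend more clearly.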

1. First-Attempt Acceptance Rate. Track weekly: what percentage of Claude's outputs do you accept without modification? This is the most direct measure of rule effectiveness. A healthy CLAUDE.md pushes this from ~40% in week one to ~70% by month three.
2. Intervention Frequency. Track weekly: how often do you interrupt, correct, or regenerate? The absolute number is less important than the trend. Declining interventions mean your rules are preventing mistakes before they happen.
3. Time-to-Merge. Track monthly: elapsed time from first prompt to PR merged. This captures the full compound effect: better rules mean better code, fewer review cycles, and faster merges. It is the ultimate business metric for CLAUDE.md ROI.