PM-301h · Module 2

Regression Testing

4 min read

A regression test is a unit test run that specifically checks whether a change to the prompt broke something that was previously passing. It is the test suite you run every time the prompt is modified, before the change is promoted to production. Without regression testing, prompt improvements are guesses: you believe you made the prompt better, but you cannot verify that the improvement did not break something else.

The regression test suite is the full golden dataset. Every time a prompt version is bumped, the complete suite runs. Any case that was passing in the previous version and is now failing is a regression. A regression does not automatically block promotion — a failing edge case that was previously passing may be an acceptable tradeoff for an improvement in standard cases — but it must be explicitly reviewed and accepted before promotion. "The regression was reviewed and accepted" is a required field in the promotion workflow. "The regression was not noticed" is a production incident waiting to happen.

# Example: prompt testing in CI/CD
name: Prompt Regression Test

on:
  push:
    paths:
      - 'prompts/**'
      - 'prompt-metadata/**'

jobs:
  regression:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Identify changed prompts
        id: changed
        run: |
          git diff --name-only HEAD~1 HEAD -- prompts/ > changed_prompts.txt
          cat changed_prompts.txt

      - name: Run regression tests for changed prompts
        run: |
          python scripts/run_regression.py             --changed-prompts changed_prompts.txt             --dataset-dir datasets/             --baseline-results results/baseline/             --output results/current/             --fail-on-regression true             --regression-review-required true

      - name: Compare against baseline
        run: |
          python scripts/compare_results.py             --baseline results/baseline/             --current results/current/             --output regression-report.json

      - name: Post regression report
        if: always()
        uses: actions/github-script@v7
        with:
          script: |
            const report = require('./regression-report.json');
            const body = formatRegressionReport(report);
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: body
            });