DR-301b · Module 3

Confidence Calibration in Prompts

Confidence calibration is the discipline of making the certainty your prompt's output expresses match how reliable that output actually is. Most research prompts produce output that reads with uniform confidence: every finding sounds equally certain, whether it is backed by SEC filings or inferred from a single blog post. Calibrated prompts force the model to distinguish between levels of certainty and to communicate that distinction to the consumer.

  1. Define the Confidence Scale. Establish a three-tier or five-tier scale in the prompt itself. Example: HIGH = multiple independent sources confirm the finding, and the data is recent and primary. MEDIUM = two sources or one primary source, and the data is within 12 months. LOW = single secondary source, data may be stale, inference involved. The definitions must appear in the prompt; do not assume the model shares your confidence framework.
  2. Require Source Attribution Per Finding. Every finding in the output must cite its evidence and carry a confidence tier. Compare "Revenue estimated at $45M (HIGH — 10-K filing, Q4 2025)" with "Market share estimated at 12% (LOW — single analyst report from 2024, methodology unclear)." The consumer sees both the finding and the reason to trust or question it.
  3. Calibrate Against Outcomes. Track your HIGH / MEDIUM / LOW assessments against eventual outcomes. If your HIGH-confidence findings turn out correct 70% of the time when you expected 90%, your prompts are over-confident: tighten the HIGH criteria. Calibration is empirical, not theoretical.
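
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the names `CONFIDENCE_SCALE`, `parse_finding`, `hit_rates`, and `overconfident_tiers` are hypothetical, the output format line inside the prompt text is one possible convention, and the only target rate taken from the text is the 90% figure for HIGH.

```python
import re
from collections import defaultdict

# Step 1: scale definitions embedded in the prompt (tier wording follows the list above).
# The final "Format" line is an assumed convention, not part of the original scale.
CONFIDENCE_SCALE = """\
Assign every finding exactly one confidence tier:
HIGH = multiple independent sources confirm; data is recent and primary.
MEDIUM = two sources or one primary source; data is within 12 months.
LOW = single secondary source; data may be stale; inference involved.
Format each finding as: <finding> (<TIER> - <evidence>)
"""

# Assumed target hit rate; only the 90% expectation for HIGH appears in the text.
TARGET_HIT_RATE = {"HIGH": 0.90}

# Step 2: a finding must end with "(TIER <separator> evidence)".
# The separator may be a hyphen, en dash, em dash, or colon.
FINDING_RE = re.compile(
    r"^(?P<claim>.+?)\s*"
    r"\((?P<tier>HIGH|MEDIUM|LOW)\s*[-\u2013\u2014:]\s*(?P<evidence>.+)\)\s*$"
)


def parse_finding(line):
    """Return claim, tier, and evidence for one output line, or None if unattributed."""
    match = FINDING_RE.match(line.strip())
    return match.groupdict() if match else None


def hit_rates(records):
    """Step 3: records is an iterable of (tier, was_correct) pairs."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for tier, was_correct in records:
        totals[tier] += 1
        correct[tier] += int(was_correct)
    return {tier: correct[tier] / totals[tier] for tier in totals}


def overconfident_tiers(records, targets=TARGET_HIT_RATE):
    """List tiers whose observed hit rate falls below the target rate."""
    rates = hit_rates(records)
    return [t for t, goal in targets.items() if t in rates and rates[t] < goal]
```

In use, `parse_finding` returning `None` flags a finding that arrived without a tier or evidence, and `overconfident_tiers` names the tiers whose criteria may need tightening once enough outcomes have been recorded. With only a handful of records per tier the observed rates are noisy, so treat them as a prompt to investigate rather than a verdict.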