PE-301a · Module 2

Score Calibration

3 min read

A propensity score of 0.72 should mean that 72% of deals with this score historically closed. That is calibration — the alignment between predicted probability and actual outcome frequency. An uncalibrated model might produce scores that rank deals correctly (higher scores are more likely to close) but whose absolute values are meaningless (a 0.72 score might correspond to only a 45% actual close rate). Calibration makes the scores interpretable as real probabilities.
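To make that concrete, here is a minimal sketch in Python of how a calibration check could be run. The scores, outcomes, and bucket edges below are hypothetical placeholders, not data from this module; they simply show the mechanics of comparing predicted scores to observed close rates per bucket.

    import numpy as np
    import pandas as pd

    # Hypothetical model scores and observed outcomes (1 = closed, 0 = lost)
    scores = np.array([0.72, 0.15, 0.88, 0.41, 0.63, 0.07, 0.55, 0.93, 0.29, 0.77])
    closed = np.array([1, 0, 1, 0, 1, 0, 1, 1, 0, 0])

    # Bucket deals by score, then compare average predicted score to actual close rate
    edges = [0.0, 0.1, 0.2, 0.3, 0.5, 0.7, 0.9, 1.0]
    buckets = pd.cut(scores, bins=edges, include_lowest=True)
    table = (
        pd.DataFrame({"bucket": buckets, "predicted": scores, "closed": closed})
        .groupby("bucket", observed=True)
        .agg(predicted=("predicted", "mean"),
             actual=("closed", "mean"),
             deals=("closed", "size"))
    )
    print(table)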

Calibration Table — Predicted vs Actual Close Rates

Score Bucket    Predicted    Actual    Deals    Calibration
──────────────  ─────────    ──────    ─────    ───────────
0.00 - 0.10     5%           4%       82       ✓ Well calibrated
0.10 - 0.20     15%          13%      71       ✓ Well calibrated
0.20 - 0.30     25%          22%      54       ✓ Slight over
0.30 - 0.50     40%          35%      63       ⚠ Over-predicted
0.50 - 0.70     60%          58%      48       ✓ Well calibrated
0.70 - 0.90     80%          72%      35       ⚠ Over-predicted
0.90 - 1.00     95%          88%      19       ⚠ Over-predicted

Action: Apply Platt scaling to adjust high-score buckets downward.

Calibration is checked by bucketing deals into score ranges and comparing predicted versus actual close rates. If the predicted and actual rates align within 5 percentage points, the model is well calibrated. If they diverge, apply calibration techniques — Platt scaling (fitting a logistic function to the output scores) or isotonic regression (non-parametric calibration) — to adjust the scores.
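As an illustration of both techniques, the sketch below fits a Platt scaler (a logistic regression on the raw scores) and an isotonic regressor using scikit-learn. The score and outcome arrays are again hypothetical, and in practice the calibrator should be fit on held-out deals rather than the data the model was trained on.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.isotonic import IsotonicRegression

    # Hypothetical held-out scores and outcomes used to fit the calibrators
    raw_scores = np.array([0.72, 0.15, 0.88, 0.41, 0.63, 0.07, 0.55, 0.93, 0.29, 0.77])
    closed = np.array([1, 0, 1, 0, 1, 0, 1, 1, 0, 0])

    # Platt scaling: fit a logistic function mapping raw scores to calibrated probabilities
    platt = LogisticRegression()
    platt.fit(raw_scores.reshape(-1, 1), closed)
    platt_calibrated = platt.predict_proba(raw_scores.reshape(-1, 1))[:, 1]

    # Isotonic regression: non-parametric, monotone mapping from raw scores to probabilities
    iso = IsotonicRegression(out_of_bounds="clip")
    iso_calibrated = iso.fit_transform(raw_scores, closed)

    print(platt_calibrated.round(2))
    print(iso_calibrated.round(2))

Platt scaling assumes the miscalibration follows a roughly sigmoid shape and works with few deals per bucket; isotonic regression is more flexible but needs more data to avoid overfitting.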