PE-301a · Module 3
Detecting and Handling Drift
3 min read
Model drift occurs when the relationship between features and outcomes changes. There are two types: data drift, where the distribution of features changes (deal sizes shift, new industries enter the pipeline), and concept drift, where the same features now predict different outcomes (decision-maker engagement used to predict a close, but a new competitor means it now also predicts a competitive loss). Both types degrade model accuracy, but they require different responses.
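The two checks can be sketched as small functions, assuming the pipeline is available as NumPy arrays. The function names, thresholds, and synthetic deal-size numbers below are illustrative, not part of any standard library.

```python
import numpy as np

def data_drift_score(train_values, current_values):
    """Data drift: how far the current feature mean has moved,
    measured in training-set standard deviations. A large value
    means the model is scoring deals unlike those it was trained on."""
    mu, sigma = np.mean(train_values), np.std(train_values)
    return abs(np.mean(current_values) - mu) / sigma

def concept_drift_delta(feature_flag, outcomes, old_period, new_period):
    """Concept drift: change in the outcome rate among deals where a
    feature fires. feature_flag and outcomes are 0/1 arrays; old_period
    and new_period are boolean masks selecting two time windows."""
    old_rate = outcomes[old_period & (feature_flag == 1)].mean()
    new_rate = outcomes[new_period & (feature_flag == 1)].mean()
    return new_rate - old_rate

# Data drift example: training deals averaged $75K, pipeline now $120K.
train_sizes = np.array([60_000, 70_000, 75_000, 80_000, 90_000], float)
current_sizes = np.array([100_000, 110_000, 120_000, 130_000, 140_000], float)
print(round(data_drift_score(train_sizes, current_sizes), 1))  # → 4.5
```

A score this far above 2.0 would trip the feature-drift monitor described below; the same pattern applied per feature gives the distribution monitoring the next section describes.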
- Data Drift Detection: Compare the distribution of each feature in the scoring population against the training population. If the average deal size in the training set was $75K and the current pipeline average is $120K, the model is being applied to data it was not trained on. Feature distribution monitoring flags this drift.
- Concept Drift Detection: Monitor the correlation between individual features and outcomes over time. If "3+ meetings in 14 days" used to correlate with a 70% close rate and now correlates with a 50% close rate, the concept has drifted — the feature still exists but means something different. Concept drift requires retraining, not just recalibration.
- Retraining Triggers: Retrain when: calibration error exceeds 10 percentage points in any score bucket, AUC drops below 0.65, or feature drift exceeds two standard deviations for more than two features. These triggers ensure retraining happens when needed — not on a fixed schedule that might be too frequent or too infrequent.
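The three triggers above can be combined into one gate. This is a minimal sketch: the function name and signature are hypothetical, and the threshold defaults simply mirror the numbers in this module.

```python
def should_retrain(bucket_calibration_errors, auc, feature_drift_sds,
                   max_calib_err=0.10, min_auc=0.65,
                   drift_sd_limit=2.0, max_drifted_features=2):
    """Return True if any retraining trigger fires.

    bucket_calibration_errors: per-score-bucket |predicted - actual| rates
    auc: current AUC on recent labeled outcomes
    feature_drift_sds: per-feature drift, in training standard deviations
    """
    # Trigger 1: calibration error over 10 pp in any score bucket.
    calibration_breach = max(bucket_calibration_errors) > max_calib_err
    # Trigger 2: discrimination has decayed below the AUC floor.
    auc_breach = auc < min_auc
    # Trigger 3: more than two features drifted past two SDs.
    drifted = sum(1 for sd in feature_drift_sds if sd > drift_sd_limit)
    drift_breach = drifted > max_drifted_features
    return calibration_breach or auc_breach or drift_breach

# One bucket is 12 pp off: retrain even though AUC and drift look fine.
print(should_retrain([0.04, 0.06, 0.12], auc=0.71,
                     feature_drift_sds=[0.5, 1.1]))  # → True
# All metrics within bounds: stay on the current model.
print(should_retrain([0.03, 0.05], auc=0.70,
                     feature_drift_sds=[0.5]))  # → False
```

Running a check like this on each scoring cycle is what makes retraining event-driven rather than calendar-driven.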