SD-301d · Module 1

Data Hygiene for Scoring

The model is a function of the data. If the data is wrong, the model is wrong. No amount of algorithmic sophistication compensates for a rep who advances a deal to "proposal sent" because they plan to send a proposal next week. That deal is not in "proposal sent." It is in "proposal planned." That distinction changes the deal's score by twelve to eighteen points. Multiply that error across forty deals and your forecast is fiction. Data hygiene is not a cleanup task. It is the foundation on which every scoring model stands or falls.

Do This

  • Define stage criteria so precisely that two different people would classify the same deal identically
  • Automate stage advancement where possible — when the proposal email is sent, the deal moves, not before
  • Audit stage accuracy monthly by sampling twenty deals and verifying their stage against the criteria
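
One way to picture the first two points is a stage gate that checks recorded evidence instead of a rep's intent. The sketch below is illustrative only: the `Deal` fields, the `STAGE_CRITERIA` table, and the `on_proposal_email_sent` handler are hypothetical names, not part of any particular CRM.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Optional


@dataclass
class Deal:
    deal_id: str
    stage: str
    # Assumption: this field is written only by the email-send event, never by hand.
    proposal_sent_at: Optional[datetime] = None


# Hypothetical criteria table: each stage maps to a check on recorded evidence.
STAGE_CRITERIA: Dict[str, Callable[[Deal], bool]] = {
    "proposal sent": lambda deal: deal.proposal_sent_at is not None,
}


def can_enter_stage(deal: Deal, stage: str) -> bool:
    """A stage with no defined criteria cannot be entered at all."""
    check = STAGE_CRITERIA.get(stage)
    return check is not None and check(deal)


def request_stage_change(deal: Deal, stage: str) -> bool:
    """Apply a manual stage change only if the evidence supports it."""
    if not can_enter_stage(deal, stage):
        return False
    deal.stage = stage
    return True


def on_proposal_email_sent(deal: Deal, sent_at: datetime) -> None:
    """Event handler: the send event itself moves the deal, not the rep."""
    deal.proposal_sent_at = sent_at
    deal.stage = "proposal sent"


deal = Deal(deal_id="D-001", stage="proposal planned")
print(request_stage_change(deal, "proposal sent"))  # False: nothing has been sent
on_proposal_email_sent(deal, datetime(2024, 1, 15, 9, 30))
print(deal.stage)  # proposal sent
```

Because the criterion is a check on a stored fact, two different people (or the same person on two different days) get the same answer for the same deal.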

Avoid This

  • Letting reps self-report stage position without verification or exit criteria
  • Accepting "it is roughly in that stage" as a classification; "roughly" is the enemy of accuracy
  • Building a sophisticated scoring model on top of data that has never been audited
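
The monthly audit from the Do This list can be sketched as a random sample compared against the stage criteria. This is a minimal sketch under stated assumptions: the record shape, the `stage_from_evidence` rule, and the example data are all made up for illustration, and a real audit would verify each sampled deal against whatever evidence your criteria require.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical record shape: {"id", "stage", "proposal_sent"}
DealRecord = Dict[str, object]


def audit_stage_accuracy(
    deals: List[DealRecord],
    verified_stage: Callable[[DealRecord], str],
    sample_size: int = 20,
    seed: Optional[int] = None,
) -> Tuple[Optional[float], List[DealRecord]]:
    """Sample deals at random and compare recorded stage with verified stage."""
    rng = random.Random(seed)
    sample = rng.sample(deals, min(sample_size, len(deals)))
    if not sample:
        return None, []
    mismatches = [d for d in sample if d["stage"] != verified_stage(d)]
    return 1 - len(mismatches) / len(sample), mismatches


def stage_from_evidence(deal: DealRecord) -> str:
    """Assumed rule: a deal is in "proposal sent" only if a proposal was sent."""
    return "proposal sent" if deal["proposal_sent"] else "proposal planned"


# Illustrative data only: 40 deals all recorded as "proposal sent",
# of which every fourth has no proposal actually sent.
deals = [
    {"id": i, "stage": "proposal sent", "proposal_sent": i % 4 != 0}
    for i in range(40)
]

accuracy, mismatches = audit_stage_accuracy(deals, stage_from_evidence, seed=7)
print(f"Sampled accuracy: {accuracy:.0%}; deals to correct: {len(mismatches)}")
```

Sampling at random, instead of letting reps pick which deals get reviewed, is what makes the accuracy figure a fair estimate of the whole pipeline.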