CM-301i · Module 1

Failure Severity Assessment


Not all failures warrant the same recovery investment. A pilot that failed quietly in a controlled environment with five users and no external visibility is a learning event. A production rollout that produced a compliance incident, generated press coverage, and damaged customer relationships is a crisis. Applying the same recovery approach to both misallocates resources in one direction or the other: a crisis-level response over-invests in the quiet pilot, and a pilot-level response under-invests in the public crisis.

Failure severity lives on two axes. The first is organizational impact: how much did the failure damage the AI initiative's ability to produce value, and how much did it affect the people and workflows that depend on it? The second is reputational damage: how much did the failure damage the organization's credibility with internal stakeholders (employees, IT, leadership) and external stakeholders (customers, regulators, partners)?

Do This

  • Assess organizational impact on a 1-5 scale: 1 = contained to the pilot with no production impact; 5 = production workflows disrupted, customer impact, regulatory exposure
  • Assess reputational damage on a 1-5 scale: 1 = internal awareness only, contained to the project team; 5 = external press coverage, customer communication required, regulatory notification
  • Add the scores: 2-4 is a learning event requiring a standard postmortem and quiet relaunch; 5-7 is a significant failure requiring structured recovery; 8-10 is a crisis requiring crisis management resources, legal involvement, and senior leadership communication
  • Calibrate recovery investment to severity: a crisis requires an immediate response, dedicated recovery resources, and an external communications strategy; a learning event requires only a postmortem and a revised approach
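
The scoring steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the course's tooling: the function name `classify_failure` and the tier labels as strings are assumptions made for the example, while the 1-5 scales and the 2-4 / 5-7 / 8-10 bands come from the list above.

```python
def classify_failure(organizational_impact: int, reputational_damage: int) -> str:
    """Map two 1-5 axis scores to a severity tier (hypothetical helper)."""
    scores = {
        "organizational_impact": organizational_impact,
        "reputational_damage": reputational_damage,
    }
    # Both axes use the same 1-5 integer scale; reject anything else.
    for name, score in scores.items():
        if not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError(f"{name} must be an integer from 1 to 5, got {score!r}")

    total = organizational_impact + reputational_damage  # ranges from 2 to 10

    if total <= 4:
        return "learning event"       # standard postmortem and quiet relaunch
    if total <= 7:
        return "significant failure"  # structured recovery
    return "crisis"                   # crisis resources, legal, senior leadership
```

Under these assumptions, a pilot scored 1 on both axes totals 2 and is classified as a learning event, while a rollout scored 5 on organizational impact and 4 on reputational damage totals 9 and is classified as a crisis.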

Avoid This

  • Treat every failure as a crisis: this depletes recovery resources on events that would resolve with a standard postmortem and relaunch
  • Treat every failure as a learning event: a governance failure that produced regulatory exposure requires a crisis-level response, regardless of any organizational preference to minimize it
  • Let the severity assessment be determined by the executive sponsor's level of discomfort: executive discomfort is correlated with reputational damage but not with organizational impact, and both axes matter