OC-301h · Module 1

Severity & Escalation Framework


Severity determines response speed, team composition, and communication cadence. Misclassification is costly in both directions: treating a SEV-1 as a SEV-3 delays the response and erodes trust, while treating a SEV-3 as a SEV-1 wastes resources. The severity framework must be explicit enough that the on-call engineer can classify an incident in under 2 minutes without consulting a manager.

SEV-1 (Critical): External stakeholder received incorrect output, autonomous actions were taken based on wrong data, or a safety boundary was violated. Response: immediate containment, incident commander assigned, stakeholder notification within 1 hour, status updates every 30 minutes.

SEV-2 (Major): Internal operations affected, output quality degraded across multiple agents, or wrong decisions made but not yet acted upon. Response: containment within 30 minutes, root cause investigation begins immediately, status updates every 2 hours.

SEV-3 (Minor): Anomaly detected, quality score below threshold, or single agent affected with no external impact. Response: investigation within 4 hours, resolution within 24 hours.

SEV-4 (Watch): Potential issue detected by monitoring, not yet confirmed. Response: investigate during business hours, document findings.
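The criteria above read naturally as an ordered checklist: test for SEV-1 conditions first, and fall through to lower levels only when nothing more severe matches. The following is a minimal sketch of that ordering, not a prescribed implementation. The `IncidentSignals` shape and all of its field names are hypothetical; the course text does not define an incident schema.

```typescript
// Hypothetical incident signals; field names are assumptions for illustration.
interface IncidentSignals {
  externalImpact: boolean;              // external stakeholder received incorrect output
  autonomousActionOnWrongData: boolean; // an action was already taken on wrong data
  safetyBoundaryViolated: boolean;
  internalOpsAffected: boolean;
  wrongDecisionPending: boolean;        // wrong decision made but not yet acted upon
  degradedAgentCount: number;           // agents with degraded output quality
  anomalyConfirmed: boolean;
  qualityBelowThreshold: boolean;
}

type SevLevel = 1 | 2 | 3 | 4;

// Checks run from most to least severe, so the first match wins.
function classifySeverity(s: IncidentSignals): SevLevel {
  if (s.externalImpact || s.autonomousActionOnWrongData || s.safetyBoundaryViolated) {
    return 1; // Critical
  }
  if (s.internalOpsAffected || s.wrongDecisionPending || s.degradedAgentCount >= 2) {
    return 2; // Major
  }
  if (s.anomalyConfirmed || s.qualityBelowThreshold || s.degradedAgentCount === 1) {
    return 3; // Minor
  }
  return 4; // Watch: flagged by monitoring but nothing confirmed yet
}

// Baseline with no signals set; spread it to build specific cases.
const noSignals: IncidentSignals = {
  externalImpact: false,
  autonomousActionOnWrongData: false,
  safetyBoundaryViolated: false,
  internalOpsAffected: false,
  wrongDecisionPending: false,
  degradedAgentCount: 0,
  anomalyConfirmed: false,
  qualityBelowThreshold: false,
};
```

Because the most severe check runs first, an incident that matches several levels at once (for example, a single degraded agent whose output also reached an external stakeholder) is classified at the highest applicable level.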

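// One row of the severity matrix. Durations are kept as display strings.
interface SeverityLevel {
  level: 1 | 2 | 3 | 4;
  name: string;
  criteria: string;      // conditions that place an incident at this level
  responseTime: string;  // how quickly the response must begin
  team: string;          // who is pulled into the response
  updateCadence: string; // how often status updates go out
}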

const severityMatrix: SeverityLevel[] = [
  {
    level: 1,
    name: 'Critical',
    criteria: 'External impact OR autonomous actions taken on wrong data OR safety violation',
    responseTime: 'Immediate',
    team: 'Incident commander + on-call + stakeholder comms',
    updateCadence: 'Every 30 minutes',
  },
  {
    level: 2,
    name: 'Major',
    criteria: 'Internal impact OR multi-agent quality degradation OR wrong decisions not yet acted upon',
    responseTime: '30 minutes',
    team: 'On-call + relevant module owner',
    updateCadence: 'Every 2 hours',
  },
  {
    level: 3,
    name: 'Minor',
    criteria: 'Single agent affected, no external impact',
    responseTime: '4 hours',
    team: 'On-call',
    updateCadence: 'Daily',
  },
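  {
    level: 4,
    name: 'Watch',
    criteria: 'Potential issue detected by monitoring, not yet confirmed',
    responseTime: 'Business hours',
    // The framework does not specify a team or update cadence for SEV-4;
    // the two values below are assumptions.
    team: 'On-call',
    updateCadence: 'On confirmation or closure',
  },
];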
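The matrix stores timings as display strings, which suits documentation but not deadline tracking. As a rough sketch, the same cadences can be held as minutes so that tooling can compute when the next status update is due. The `updateEveryMinutes` table and the `nextUpdateDue` helper are hypothetical names; the values mirror the cadences listed in the matrix, with `null` meaning no fixed cadence is defined for that level.

```typescript
type Sev = 1 | 2 | 3 | 4;

// Update cadence in minutes per severity level; null = no fixed cadence.
// SEV-4 has no cadence in the framework, so it is left as null here.
const updateEveryMinutes: Record<Sev, number | null> = {
  1: 30,      // every 30 minutes
  2: 120,     // every 2 hours
  3: 24 * 60, // daily
  4: null,
};

// Returns when the next status update is due, or null if none is scheduled.
function nextUpdateDue(sev: Sev, lastUpdate: Date): Date | null {
  const every = updateEveryMinutes[sev];
  if (every === null) {
    return null;
  }
  return new Date(lastUpdate.getTime() + every * 60 * 1000);
}

// Example: a SEV-2 incident last updated at 09:00 UTC.
const lastUpdate = new Date('2024-01-01T09:00:00Z');
const due = nextUpdateDue(2, lastUpdate);
console.log(due ? due.toISOString() : 'no update scheduled');
```

Keeping the numeric table next to the human-readable matrix means the two can drift apart; one option is to derive the display strings from the numbers rather than maintain both by hand.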