AS-201c · Module 2
AI Incident Classification
Good news, everyone! Not every alert is an incident. Not every incident is a crisis. The difference between a monitoring system that improves security and a monitoring system that drowns your team in noise is classification — the structured process of determining what happened, how severe it is, and what response it demands.
- Severity 1: Critical - Active Data Breach. Examples: confirmed exfiltration of sensitive data through the AI system; customer PII in model outputs; credentials exposed through conversation logs; unauthorized tool execution that accessed restricted systems. Response: immediate containment, incident commander activation, stakeholder notification within one hour.
- Severity 2: High - Confirmed Exploit. Examples: successful prompt injection that changed model behavior; system prompt extracted by a user; model consistently bypassing safety constraints; unauthorized access to AI admin functions. Response: containment within one hour, root cause analysis within four hours, stakeholder notification within 24 hours.
- Severity 3: Medium - Attempted Exploit. Examples: detected injection attempts that were blocked by defenses; anomalous query patterns suggesting reconnaissance; guardrail triggers that prevented data leakage. Response: log analysis within 24 hours, defense validation, pattern added to detection rules.
- Severity 4: Low - Suspicious Activity. Examples: statistical anomalies without confirmed malicious intent; unusual usage patterns from known users; single guardrail triggers without follow-up activity. Response: logged for trend analysis, reviewed in weekly security review, no immediate action required.
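The four tiers above can be sketched as a small decision function. This is a minimal illustration, not a reference implementation: the `Alert` fields, the `classify` function, and the `RESPONSE_HOURS` table are hypothetical names chosen for this example, and a real alert schema will carry far more context. The deadlines are copied from the tier descriptions.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    CRITICAL = 1  # active data breach
    HIGH = 2      # confirmed exploit
    MEDIUM = 3    # attempted exploit that was blocked
    LOW = 4       # suspicious activity


@dataclass(frozen=True)
class Alert:
    # Hypothetical triage fields (assumptions for this sketch).
    breach_confirmed: bool = False   # exfiltration, exposed PII or credentials,
                                     # or tool execution that reached restricted systems
    exploit_succeeded: bool = False  # injection worked, prompt extracted, constraints bypassed
    attack_blocked: bool = False     # a defense stopped a detected attempt


def classify(alert: Alert) -> Severity:
    """Check the most severe condition first so an alert is never under-classified."""
    if alert.breach_confirmed:
        return Severity.CRITICAL
    if alert.exploit_succeeded:
        return Severity.HIGH
    if alert.attack_blocked:
        return Severity.MEDIUM
    return Severity.LOW


# Response deadlines in hours, taken from the tiers above (0 = immediate).
RESPONSE_HOURS = {
    Severity.CRITICAL: {"containment": 0, "stakeholder_notification": 1},
    Severity.HIGH: {"containment": 1, "root_cause_analysis": 4,
                    "stakeholder_notification": 24},
    Severity.MEDIUM: {"log_analysis": 24},
    Severity.LOW: {},  # weekly security review, no immediate action
}

severity = classify(Alert(exploit_succeeded=True))
print(severity.name, RESPONSE_HOURS[severity])
```

Ordering the checks from most to least severe means an alert that matches several tiers always receives the highest applicable one.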
Do This
- Classify every alert before deciding on a response — severity determines the urgency and scope of action
- Document classification criteria so different team members reach the same severity level for the same event
- Review classifications retroactively — was the severity accurate? Would you classify it differently now?
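One way to make the retroactive review concrete is to compare the severity assigned at triage with the severity assigned in hindsight. The sketch below assumes each incident is recorded as a tuple of ID, initial severity, and reviewed severity, with severities as integers from 1 (critical) to 4 (low). The function name `review_classifications` and the incident IDs are illustrative, not part of any real tool.

```python
from collections import Counter


def review_classifications(records):
    """Tally how often the initial severity matched the reviewed severity.

    records: iterable of (incident_id, initial_severity, reviewed_severity),
    with severities as integers 1 (critical) to 4 (low).
    """
    outcome = Counter()
    changed = []
    for incident_id, initial, reviewed in records:
        if initial == reviewed:
            outcome["accurate"] += 1
        elif initial > reviewed:
            # A larger number means lower urgency, so triage was too relaxed.
            outcome["under_classified"] += 1
            changed.append(incident_id)
        else:
            outcome["over_classified"] += 1
            changed.append(incident_id)
    return outcome, changed


# Made-up sample data for illustration only.
records = [
    ("INC-001", 3, 3),
    ("INC-002", 4, 2),  # looked like noise, turned out to be a confirmed exploit
    ("INC-003", 1, 2),  # escalated higher than needed
]
outcome, changed = review_classifications(records)
print(dict(outcome))  # {'accurate': 1, 'under_classified': 1, 'over_classified': 1}
print(changed)        # ['INC-002', 'INC-003']
```

A rising under-classified count suggests the documented criteria are too lenient or ambiguous. A rising over-classified count points toward the alert fatigue this module warns about.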
Avoid This
- Treat every alert as a critical incident — alert fatigue leads to real incidents being ignored
- Leave classification to individual judgment without criteria — inconsistency creates security gaps
- Skip classification and go straight to response — mismatched response severity wastes resources or misses threats