AS-201c · Module 3

AI Forensic Analysis

Forensic analysis for AI systems follows the same principles as traditional digital forensics (preserve the evidence, reconstruct the timeline, determine the impact), but the evidence is different. In a traditional breach, you examine network logs, file system changes, and authentication records. In an AI breach, you examine conversation logs, context window contents, model outputs, tool invocation records, and guardrail trigger events. The forensic toolkit is different, but the methodology is the same: follow the data.

  1. Preserve the Evidence. Before any analysis, snapshot the system state: conversation logs, context window contents at the time of the incident, model configuration, system prompt version, tool access permissions, and guardrail configuration. Make the copies immutable, timestamp them, and document the chain of custody. If this evidence is modified during the investigation, its forensic value is destroyed.
  2. Reconstruct the Attack Timeline. Starting from the alert that triggered the incident, work backward through the logs. When did the suspicious activity begin? What was the first anomalous input? What was the model's response? Did the attacker probe the system before exploiting it? Map every relevant interaction on a timeline.
  3. Determine the Blast Radius. What data was in the context window during the compromised sessions? What tools were invoked? What systems were accessed? What outputs were sent to the attacker? The blast radius is not what could have been accessed; it is what was actually accessed. The logs tell you the difference.
  4. Identify the Root Cause. Was it an input that bypassed sanitization? A system prompt that was not hardened against a specific attack? A tool permission that was too broad? An output guardrail that missed a pattern? The root cause is the specific defense gap that the attacker exploited. Closing that gap addresses this class of attack, not just this one incident.
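
The first three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `LogEvent` record, its field names, the event kinds, and the sample data are all hypothetical, and a real deployment would read these from whatever logging pipeline it already has. The sketch hashes a snapshot so later modification is detectable (step 1), orders events leading up to the alert and finds the first anomalous one (step 2), and lists only the tools and outputs that actually appear in the logs for the compromised sessions (step 3).

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEvent:
    """One logged interaction. All field names here are hypothetical."""
    timestamp: datetime
    session_id: str
    kind: str       # assumed kinds: "input", "output", "tool_call"
    detail: str
    anomalous: bool = False


def snapshot_digest(snapshot: dict) -> str:
    """Step 1: SHA-256 over a canonical JSON encoding of the snapshot.

    Recording this digest at collection time lets you show later that
    the preserved copy has not been modified.
    """
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def reconstruct_timeline(events, alert_time):
    """Step 2: events up to the alert, oldest first, plus the first anomaly."""
    before_alert = sorted(
        (e for e in events if e.timestamp <= alert_time),
        key=lambda e: e.timestamp,
    )
    first_anomaly = next((e for e in before_alert if e.anomalous), None)
    return before_alert, first_anomaly


def blast_radius(events, compromised_sessions):
    """Step 3: what was actually invoked and sent, according to the logs."""
    tools = set()
    outputs = []
    for e in events:
        if e.session_id not in compromised_sessions:
            continue
        if e.kind == "tool_call":
            tools.add(e.detail)
        elif e.kind == "output":
            outputs.append(e.detail)
    return {"tools_invoked": sorted(tools), "outputs_sent": outputs}


def at(minute: int) -> datetime:
    """Helper for the made-up sample data below."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


# Made-up sample log, deliberately out of order.
events = [
    LogEvent(at(5), "s1", "tool_call", "read_file"),
    LogEvent(at(1), "s1", "input", "ignore previous instructions", anomalous=True),
    LogEvent(at(0), "s2", "input", "hello"),
    LogEvent(at(6), "s1", "output", "file contents"),
    LogEvent(at(9), "s2", "tool_call", "send_email"),
]

recorded = snapshot_digest({"system_prompt_version": "v3", "tools": "read_file"})
timeline, first_anomaly = reconstruct_timeline(events, alert_time=at(7))
radius = blast_radius(events, compromised_sessions={"s1"})

print(recorded)
print(first_anomaly)
print(radius)
```

Step 4 is left out on purpose: root cause is a judgment made by reading the timeline and blast radius against the defenses in place, and it does not reduce to a short function. Note also that a digest only detects modification; keeping the copies immutable still requires write-protected storage.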