AS-201b · Module 3

Security Logging and Forensics

If you cannot see what happened, you cannot determine whether something went wrong. Security logging for AI systems captures the full lifecycle of every interaction: what went in, what the model produced, what guardrails flagged, what reached the user. Without these logs, a breach is invisible until the damage surfaces externally — and by then, you have lost the ability to understand what was compromised, when, and how.

Log Inputs: Record every user message, every document retrieved into the context window, and every tool input. Store them with timestamps and user identifiers. These logs are the forensic trail that reconstructs what data the model was exposed to and when.
Log Outputs: Record the model's complete response before any post-processing. If a guardrail modifies or blocks the output, log both the original and the modified version. The gap between "what the model wanted to say" and "what reached the user" is where the security events live.
Log Guardrail Events: Every time an input filter triggers, an output guardrail flags something, or a classifier detects anomalous behavior, log it. These events are the early warning system. A single guardrail trigger is a data point. A cluster of guardrail triggers from the same user or IP is grounds for an investigation.
Set Retention and Access Policies: Logs that contain user inputs may contain sensitive data. Define retention periods, encrypt at rest, restrict access to security personnel, and ensure compliance with your data protection obligations. The logs that protect you from external threats must not become an internal threat themselves.
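
The logging practices above can be sketched as a single structured record per interaction. This is a minimal illustration, not a reference implementation: the `InteractionRecord` class, its field names, the `users_with_trigger_clusters` helper, and the threshold of 3 are all assumptions made for this example. A real deployment would also need the encryption, retention, and access controls described above, which this sketch does not cover.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class InteractionRecord:
    """One logged interaction. All field names are hypothetical."""
    user_id: str
    user_message: str
    retrieved_documents: list[str]   # documents placed in the context window
    tool_inputs: list[dict]          # arguments passed to any tools
    raw_output: str                  # model response before post-processing
    delivered_output: str            # what actually reached the user
    guardrail_events: list[dict] = field(default_factory=list)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def output_modified(self) -> bool:
        # True when a guardrail changed or blocked the raw response.
        return self.raw_output != self.delivered_output

    def to_json(self) -> str:
        # One JSON object per interaction, suitable for an append-only log.
        return json.dumps(asdict(self), sort_keys=True)


def users_with_trigger_clusters(records, threshold=3):
    """Return user IDs whose total guardrail triggers reach the threshold.

    The threshold is an arbitrary example value, not a recommendation.
    """
    counts = Counter()
    for record in records:
        counts[record.user_id] += len(record.guardrail_events)
    return {user for user, n in counts.items() if n >= threshold}
```

Keeping `raw_output` and `delivered_output` side by side makes the gap between them directly queryable, and counting triggers per user is one simple way to separate an isolated data point from a cluster worth investigating.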

Forensic readiness means that when — not if — an incident occurs, you can reconstruct the full attack chain within hours, not weeks. Which input triggered the anomaly? What data was in the context window? What did the model produce? What reached the user? What tool actions were executed? Each of these questions should be answerable from your logs within minutes of starting an investigation.
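
As a rough sketch of what answering those questions from logs might look like, the snippet below rebuilds the timeline of one interaction from JSON-lines log entries. The log schema (`interaction_id`, `timestamp`, `type`, `payload`) and both function names are hypothetical, chosen only for this example. It also assumes timestamps are ISO 8601 strings in a single uniform UTC format, so that sorting them as strings gives chronological order.

```python
import json


def reconstruct_interaction(log_lines, interaction_id):
    """Return all logged events for one interaction, oldest first.

    Assumes each line is a JSON object with hypothetical keys
    'interaction_id', 'timestamp', 'type', and 'payload'.
    """
    events = [json.loads(line) for line in log_lines if line.strip()]
    matching = [e for e in events if e.get("interaction_id") == interaction_id]
    # Uniform ISO 8601 UTC strings sort chronologically as plain strings.
    return sorted(matching, key=lambda e: e["timestamp"])


def summarize(events):
    """Group payloads by event type (input, context, output, tool action...)."""
    summary = {}
    for event in events:
        summary.setdefault(event["type"], []).append(event["payload"])
    return summary


# Example usage with made-up log lines.
sample_log = [
    '{"interaction_id": "a1", "timestamp": "2024-01-01T00:00:02Z", '
    '"type": "model_output", "payload": "raw response"}',
    '{"interaction_id": "a1", "timestamp": "2024-01-01T00:00:01Z", '
    '"type": "input", "payload": "user message"}',
    '{"interaction_id": "b2", "timestamp": "2024-01-01T00:00:03Z", '
    '"type": "input", "payload": "unrelated"}',
]
timeline = reconstruct_interaction(sample_log, "a1")
by_type = summarize(timeline)
```

If every event carries a shared interaction identifier, each forensic question becomes a lookup by event type rather than a manual search across separate systems.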