AS-301g · Module 1
Log Pipeline Engineering
3 min read
The log pipeline is the infrastructure between the agent and the SIEM — the collection, normalization, enrichment, and shipping layers that transform raw events into SIEM-ingestible data. A poorly designed pipeline drops events under load, introduces latency that delays detection, and adds cost that makes retention impractical. A well-designed pipeline handles peak load without loss, delivers events within seconds, and compresses data for efficient storage.
- **Collection Layer.** Agents emit events to a local collector (Fluentd, Vector, or the SIEM's native agent) that buffers events locally before forwarding. Local buffering prevents event loss during network disruptions. The collector normalizes events to the standard schema before shipping.
- **Enrichment Layer.** Before reaching the SIEM, events are enriched with context: agent role, data classification of the context window contents, risk score of the operation, and geolocation of the requesting user. Enrichment at pipeline time avoids lookup latency during SIEM query execution.
- **Shipping and Retention.** Ship enriched events to the SIEM for real-time detection and to cold storage for long-term forensics. The SIEM retains 90 days for active querying. Cold storage retains 1-7 years for compliance and forensic needs. Dual-destination shipping ensures detection speed and audit retention are both satisfied.
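As a sketch, the pipeline-time enrichment step might look like the following in Python. The field names, lookup tables, and risk-scoring rule are illustrative assumptions, not a standard schema; a real pipeline would resolve these from an asset inventory and a data-classification service.

```python
# Hypothetical lookup tables -- in practice these would be backed by an
# asset inventory and a classification service, not inline dicts.
AGENT_ROLES = {"agent-7": "code-review"}
DATA_CLASSES = {"repo:payments": "confidential"}

def enrich(event: dict) -> dict:
    """Attach context at pipeline time so SIEM queries avoid join lookups."""
    enriched = dict(event)
    enriched["agent_role"] = AGENT_ROLES.get(event["agent_id"], "unknown")
    enriched["data_class"] = DATA_CLASSES.get(event["resource"], "unclassified")
    # Illustrative risk score: writes and confidential data raise it.
    risk = 1
    if event.get("operation") == "write":
        risk += 2
    if enriched["data_class"] == "confidential":
        risk += 3
    enriched["risk_score"] = risk
    return enriched

event = {"agent_id": "agent-7", "resource": "repo:payments", "operation": "write"}
print(enrich(event)["risk_score"])  # 6
```

Because the enriched fields are computed once at ingest, a detection query can filter on `risk_score` or `data_class` directly instead of joining against external tables at query time.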
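The local buffering and dual-destination shipping described in the collection and shipping bullets can be sketched as below. The `DualShipper` class and its sink callables are hypothetical stand-ins for a real forwarder API (a Vector or Fluentd output, say); delivery here is at-least-once, since an event that reached one sink before a failure is re-sent to both on retry.

```python
from collections import deque

class DualShipper:
    """Sketch: buffer events locally, then ship each one to two
    destinations -- the SIEM (hot, 90-day) and cold storage (1-7 year)."""

    def __init__(self, siem_sink, cold_sink, max_buffer=10_000):
        self.buffer = deque(maxlen=max_buffer)  # bounded local buffer
        self.sinks = [siem_sink, cold_sink]

    def collect(self, event: dict) -> None:
        """Accept an event from the agent; survive a downstream outage."""
        self.buffer.append(event)

    def flush(self) -> int:
        """Forward buffered events; retain any that fail for the next flush."""
        shipped = 0
        failed = []
        while self.buffer:
            event = self.buffer.popleft()
            try:
                for sink in self.sinks:
                    sink(event)
                shipped += 1
            except ConnectionError:
                failed.append(event)  # network disruption: keep the event
        self.buffer.extend(failed)
        return shipped

# Usage with in-memory lists standing in for the two destinations:
siem, cold = [], []
shipper = DualShipper(siem.append, cold.append)
shipper.collect({"event": "tool_call"})
shipper.flush()  # event now present in both siem and cold
```

The bounded `deque` is the design choice worth noting: an unbounded buffer turns a long outage into an out-of-memory failure, so the sketch drops the oldest events once `max_buffer` is reached rather than crashing the collector.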