SA-301e · Module 2

Kappa vs. Lambda Architecture

3 min read

Lambda architecture runs batch and streaming in parallel. The batch layer processes the full dataset for accuracy. The speed layer processes real-time events for freshness. The serving layer merges both views. This provides both accuracy and freshness — at the cost of maintaining two processing pipelines with identical business logic in different frameworks. Kappa architecture simplifies by using a single stream processing layer for everything, replaying the stream for reprocessing instead of maintaining a separate batch layer.

  1. Lambda: When Accuracy and Freshness Are Both Critical Financial reporting needs exact numbers (batch) and near-real-time dashboards (streaming). The batch layer corrects the streaming approximations overnight. The trade-off: maintaining two pipelines doubles the development and operational effort. The benefit: the batch layer provides a correctness guarantee that streaming alone cannot.
  2. Kappa: When One Pipeline Is Sufficient If the stream processor can handle the full data volume during reprocessing, a single pipeline serves both real-time and historical queries. When business logic changes, replay the stream from the beginning with the new logic. The trade-off: reprocessing the full stream requires retaining all events and sufficient compute for replay. The benefit: one pipeline, one codebase, one operational model.
  3. The Practical Choice Start with Kappa. Move to Lambda only when evidence shows that the streaming layer cannot provide the accuracy the business requires — typically in financial reconciliation, regulatory reporting, or ML training pipelines where approximate results are unacceptable. Most systems do not need Lambda. The ones that do know it early because the accuracy requirements are explicit in the business requirements.