SA-301d · Module 3

API Observability

An API without observability is a black box with a contract attached. You know what the API promises but not how it performs. API observability — structured logging, distributed tracing, and real-time metrics — provides the visibility that enables proactive operations instead of reactive firefighting. The observability design should be part of the API architecture, not bolted on after the first production incident.

Request Metrics

Track four metrics per endpoint: request rate (throughput), error rate (reliability), latency distribution (performance), and payload size (efficiency). The first three are the RED metrics (rate, errors, duration); payload size is the addition. Dashboard them per endpoint and per consumer. Alert on deviations from baselines rather than on fixed thresholds: an endpoint that normally serves 100 requests per second and drops to 10 is a signal even if 10 is still within the absolute threshold.
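
As a rough illustration of baseline-relative alerting, the sketch below flags a request rate that sits far from its recent history, using a simple z-score. The function name `deviates_from_baseline`, the sample values, and the threshold of 3 standard deviations are assumptions chosen for the example; a real system would more likely rely on its monitoring platform's anomaly detection.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, z_threshold=3.0):
    """Return True when `current` lies more than `z_threshold` sample
    standard deviations away from the mean of `history`.

    `history` is a list of recent per-interval values for one endpoint,
    for example requests per second. (Hypothetical helper.)
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu  # perfectly flat baseline: any change deviates
    return abs(current - mu) / sigma > z_threshold


# Assumed sample data: an endpoint that normally serves about 100 req/s.
baseline_rps = [98, 102, 100, 101, 99, 100, 103, 97]
fixed_floor = 5  # an assumed absolute alert threshold

baseline_alert = deviates_from_baseline(baseline_rps, 10)
fixed_alert = 10 < fixed_floor

print(baseline_alert)  # True: 10 req/s is far outside the usual range
print(fixed_alert)     # False: a fixed floor of 5 req/s never fires
```

The drop to 10 requests per second trips the baseline check while the fixed threshold stays silent, which is the gap that baseline-relative alerting is meant to close.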

Distributed Tracing

Every API request should carry a trace ID that propagates through every downstream service call. When an API response is slow, the trace shows which downstream service introduced the latency. Without tracing, latency debugging requires correlating entries across multiple log systems, a process that can take hours. With tracing, the same investigation can take minutes.
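
One way to picture trace ID propagation is with the W3C Trace Context `traceparent` header, whose format is `version-traceid-parentid-flags`. The sketch below is a hand-rolled simplification for illustration only: the helper names are hypothetical, validation is minimal, and headers are assumed to arrive as a dict with lowercase keys. In practice a tracing SDK such as OpenTelemetry would normally handle this.

```python
import secrets


def new_traceparent():
    """Start a new trace: 32 hex chars of trace ID, 16 of span ID."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-01"  # flags 01 = sampled


def child_traceparent(incoming):
    """Keep the caller's trace ID but mint a new span ID for this hop.

    Falls back to a fresh trace when the incoming value is missing or
    does not have the expected shape (simplified validation).
    """
    parts = incoming.split("-") if incoming else []
    if len(parts) != 4 or len(parts[1]) != 32 or len(parts[2]) != 16:
        return new_traceparent()
    version, trace_id, _parent_id, flags = parts
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def outgoing_headers(incoming_headers):
    """Build the headers to attach to a downstream service call."""
    incoming = incoming_headers.get("traceparent")
    return {"traceparent": child_traceparent(incoming)}


# Example: an inbound request that already belongs to a trace.
inbound = {
    "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
}
downstream = outgoing_headers(inbound)
print(downstream["traceparent"])
```

Because every hop reuses the same trace ID and only changes the span ID, a tracing backend can stitch the hops into one timeline and attribute latency to a specific service.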

Consumer Analytics

Track which consumers call which endpoints, how often, and which API versions they use. Consumer analytics inform deprecation decisions (who is still on v1?), capacity planning (which consumer drives 60% of traffic?), and partnership conversations (which consumer's usage pattern suggests they need a premium tier?). The API is a product, and consumer analytics are its product metrics.
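
The questions above can be answered from structured request logs with a small aggregation. The sketch below assumes each log record carries `consumer`, `endpoint`, and `version` fields; those field names, the consumer names, and the sample records are all hypothetical.

```python
from collections import Counter

# Assumed sample of structured request log records.
request_log = [
    {"consumer": "mobile-app", "endpoint": "/orders", "version": "v2"},
    {"consumer": "mobile-app", "endpoint": "/orders", "version": "v2"},
    {"consumer": "mobile-app", "endpoint": "/users", "version": "v2"},
    {"consumer": "billing-app", "endpoint": "/orders", "version": "v1"},
    {"consumer": "partner-x", "endpoint": "/users", "version": "v1"},
]


def traffic_share(records):
    """Fraction of total requests attributable to each consumer."""
    counts = Counter(r["consumer"] for r in records)
    total = sum(counts.values())
    return {consumer: n / total for consumer, n in counts.items()}


def consumers_on_version(records, version):
    """Sorted list of consumers still calling the given API version."""
    return sorted({r["consumer"] for r in records if r["version"] == version})


def calls_per_endpoint(records):
    """Request count for each (consumer, endpoint) pair."""
    return Counter((r["consumer"], r["endpoint"]) for r in records)


share = traffic_share(request_log)
still_on_v1 = consumers_on_version(request_log, "v1")
pairs = calls_per_endpoint(request_log)

print(share["mobile-app"])  # 0.6
print(still_on_v1)          # ['billing-app', 'partner-x']
```

The same three aggregations map onto the capacity-planning, deprecation, and usage-pattern questions; at production volume they would typically run as queries in a log or metrics store rather than in application code.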