SA-301f · Module 3

Composing Resilience Patterns


Individual resilience patterns (circuit breakers, bulkheads, retries, timeouts, fallbacks) are building blocks. A production architecture composes them into a layered defense, and the composition order matters. Retries go inside circuit breakers, not outside, where they would keep retrying against an open circuit. Circuit breakers go inside bulkheads, which isolate the failing dependency's resource consumption. Timeouts apply at every layer, so that an indefinite wait in one layer cannot propagate to the layers above it.
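The ordering above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: `Bulkhead`, `CircuitBreaker`, `with_retry`, `with_timeout`, and `resilient_call` are hypothetical names invented for this example, the thresholds are arbitrary, and the breaker is not thread-safe. The timeout relies on the standard library's `Future.result(timeout=...)`, which stops waiting but does not cancel the worker thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BulkheadFull(Exception):
    pass


class CircuitOpen(Exception):
    pass


class Bulkhead:
    """Caps concurrent calls to one dependency (hypothetical helper)."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn):
        if not self._slots.acquire(blocking=False):
            raise BulkheadFull("no free slot for this dependency")
        try:
            return fn()
        finally:
            self._slots.release()


class CircuitBreaker:
    """Opens after consecutive failures (simplified, not thread-safe)."""

    def __init__(self, failure_threshold, reset_after_s):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                raise CircuitOpen("dependency marked unhealthy")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result


def with_retry(fn, attempts, base_delay_s):
    """Retries with exponential backoff; re-raises the last error."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay_s * 2 ** attempt)


_pool = ThreadPoolExecutor(max_workers=4)


def with_timeout(fn, timeout_s):
    """Stops waiting after timeout_s; the worker thread is not cancelled."""
    return _pool.submit(fn).result(timeout=timeout_s)


def resilient_call(fn, bulkhead, breaker, attempts=3, base_delay_s=0.01,
                   timeout_s=0.5):
    # Outside in: bulkhead -> circuit breaker -> retry -> timeout.
    return bulkhead.call(
        lambda: breaker.call(
            lambda: with_retry(
                lambda: with_timeout(fn, timeout_s), attempts, base_delay_s)))


# Usage: a dependency that fails twice, then recovers.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


breaker = CircuitBreaker(failure_threshold=2, reset_after_s=30.0)
result = resilient_call(flaky, Bulkhead(max_concurrent=2), breaker)
print(result, calls["n"], breaker.failures)  # prints: ok 3 0
```

Because the retry loop sits inside the breaker, the breaker records one failure per exhausted retry sequence rather than one per attempt, and once it opens, later calls fail immediately without reaching the retry loop at all.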

  1. The Composition Stack. From outside in: bulkhead (isolate resources) → circuit breaker (detect dependency failure) → retry with backoff (handle transient failures) → timeout (prevent indefinite waits). The bulkhead contains the blast radius. The circuit breaker detects sustained failure. The retry handles transient failure. The timeout prevents resource leaks. Each layer addresses a different failure mode.
  2. Timeout Cascade Prevention. The outer timeout must be longer than the worst-case total of the inner timeouts across all attempts. If the API gateway timeout is 30 seconds but the downstream service makes up to 3 attempts with 15-second timeouts, the downstream chain can take 45 seconds, exceeding the gateway timeout. The gateway returns an error to the caller while the downstream attempts continue burning resources on a response nobody is waiting for. Align timeouts across the stack: gateway timeout > (per-attempt timeout × attempt count) + total backoff delay.
  3. Observability Integration. Every resilience pattern must emit metrics: circuit breaker state transitions, bulkhead pool utilization, retry counts per request, timeout counts per dependency. Together these metrics form the operational dashboard for system resilience. A circuit breaker that opens frequently signals a dependency problem. A bulkhead that fills frequently signals a capacity problem. Without observability, the patterns are protections you cannot see working.
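The timeout alignment rule in item 2 is simple arithmetic, so it can be checked in configuration tests. The sketch below is one way to do that, assuming the retry count is read as the total number of attempts and that backoff doubles after each failed attempt. The function names and the `margin_s` safety margin are hypothetical, chosen for this example only.

```python
def worst_case_seconds(attempt_timeout_s, attempts, base_backoff_s=0.0):
    """Worst case: every attempt times out, with backoff between attempts."""
    backoff = sum(base_backoff_s * 2 ** i for i in range(attempts - 1))
    return attempt_timeout_s * attempts + backoff


def fits_within(outer_timeout_s, attempt_timeout_s, attempts,
                base_backoff_s=0.0):
    """True if the outer layer outlasts the inner retry chain."""
    inner = worst_case_seconds(attempt_timeout_s, attempts, base_backoff_s)
    return outer_timeout_s > inner


def max_attempt_timeout(outer_timeout_s, attempts, base_backoff_s=0.0,
                        margin_s=1.0):
    """Largest per-attempt timeout that leaves margin_s under the outer one."""
    backoff = sum(base_backoff_s * 2 ** i for i in range(attempts - 1))
    return (outer_timeout_s - margin_s - backoff) / attempts


# The example from item 2: 30 s gateway, 3 attempts of 15 s each.
print(worst_case_seconds(15, 3))              # 45.0
print(fits_within(30, 15, 3))                 # False
print(max_attempt_timeout(30, 3, margin_s=3)) # 9.0
```

With these numbers, a 9-second per-attempt timeout gives a 27-second worst case, which fits under the 30-second gateway timeout with 3 seconds to spare.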