MP-301g · Module 2

Stateful vs Stateless Sessions

The decision between stateful and stateless MCP sessions is an architectural choice with cascading consequences. Stateless sessions carry all context in each request: the client resends capabilities, tool state, and conversation context with every call. With respect to session context, the server behaves like a pure function: it keeps no memory between requests, so the same request gets the same handling on any instance. This makes horizontal scaling trivial (any instance can handle any request) and failure recovery simple (retry against any instance, provided the operation is safe to repeat). The cost is larger request payloads and capability negotiation repeated on every call.
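
To make the stateless model concrete, here is a minimal sketch. The StatelessRequest envelope, the handle function, and the counter.increment tool are hypothetical names chosen for illustration, not part of any MCP SDK. The point is only that the handler reads nothing except its argument and hands the updated state back to the client.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class StatelessRequest:
    """Hypothetical envelope: everything the server needs travels with the call."""
    capabilities: tuple[str, ...]   # re-sent by the client on every request
    tool_state: dict[str, Any]      # accumulated state, owned by the client
    tool: str
    arguments: dict[str, Any] = field(default_factory=dict)


def handle(request: StatelessRequest) -> dict[str, Any]:
    """Process one call using only the request; no server-side state is read."""
    if request.tool not in request.capabilities:
        return {"error": f"tool {request.tool!r} not negotiated"}
    if request.tool == "counter.increment":
        current = request.tool_state.get("count", 0)
        step = request.arguments.get("step", 1)
        # The new state goes back to the client instead of being stored here.
        return {
            "result": current + step,
            "tool_state": {**request.tool_state, "count": current + step},
        }
    return {"error": f"unknown tool {request.tool!r}"}
```

Because handle depends only on its input, two server instances given the same request return the same response, which is what makes retrying against any instance safe in this sketch.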

Stateful sessions store context server-side, keyed by Mcp-Session-Id. The server remembers the negotiated capabilities, accumulated tool state, and any server-side resources the client has subscribed to. Requests are smaller because context is implicit. But statefulness introduces the affinity problem: requests for a given session must reach the instance holding that session's state. This means sticky load balancing, session replication, or externalized state — each with its own failure modes.
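
The affinity problem can be shown with a small sketch. StatefulInstance and SessionNotFound are hypothetical names, and the session ID here is a plain string standing in for the Mcp-Session-Id value. Each instance keeps sessions in its own local dictionary, so a request routed to the wrong instance finds nothing.

```python
from typing import Any


class SessionNotFound(Exception):
    """Raised when an instance has no local state for the given session ID."""


class StatefulInstance:
    def __init__(self, name: str) -> None:
        self.name = name
        # Local memory only: other instances cannot see this dictionary.
        self._sessions: dict[str, dict[str, Any]] = {}

    def initialize(self, session_id: str, capabilities: list[str]) -> None:
        self._sessions[session_id] = {
            "capabilities": capabilities,
            "subscriptions": set(),
        }

    def subscribe(self, session_id: str, resource_uri: str) -> int:
        session = self._sessions.get(session_id)
        if session is None:
            raise SessionNotFound(f"{self.name} has no state for {session_id}")
        session["subscriptions"].add(resource_uri)
        return len(session["subscriptions"])
```

If instance a runs initialize("s-1", ...) and the load balancer later sends subscribe("s-1", ...) to instance b, the call raises SessionNotFound. Sticky routing hides this until instance a goes away.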

Hybrid approaches are the pragmatic choice for most production systems. Keep the session stateless for the hot path (tool invocations, resource reads) but stateful for the slow path (capability negotiation, subscription management). Externalize the stateful parts to Redis or a shared database so that any instance can serve any session. The session ID becomes a key into the external store, not a pointer to local memory. This gives you the scaling benefits of statelessness with the ergonomic benefits of state — at the cost of an external dependency.
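
A minimal sketch of the hybrid layout follows. SessionStore, DictStore, and HybridInstance are hypothetical names. DictStore is an in-process stand-in for Redis or a shared database, used only so the example runs without external services. A real deployment would swap in a client for the actual store and would need to handle concurrent updates.

```python
import json
import uuid
from typing import Optional, Protocol


class SessionStore(Protocol):
    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...


class DictStore:
    """In-process stand-in for an external store, for illustration only."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class HybridInstance:
    def __init__(self, store: SessionStore) -> None:
        self._store = store  # shared by all instances; no session data on self

    # Slow path: stateful, backed by the external store.
    def initialize(self, capabilities: list[str]) -> str:
        session_id = uuid.uuid4().hex
        state = {"capabilities": capabilities, "subscriptions": []}
        self._store.set(f"session:{session_id}", json.dumps(state))
        return session_id

    def subscribe(self, session_id: str, resource_uri: str) -> list[str]:
        raw = self._store.get(f"session:{session_id}")
        if raw is None:
            raise KeyError(f"unknown session {session_id}")
        session = json.loads(raw)
        if resource_uri not in session["subscriptions"]:
            session["subscriptions"].append(resource_uri)
        # Read-modify-write is not atomic here; a real store needs a transaction.
        self._store.set(f"session:{session_id}", json.dumps(session))
        return session["subscriptions"]

    # Hot path: stateless, everything needed is in the arguments.
    def call_tool(self, name: str, arguments: dict) -> dict:
        if name == "add":
            return {"result": arguments["a"] + arguments["b"]}
        return {"error": f"unknown tool {name!r}"}
```

Two HybridInstance objects built on the same store can serve the same session interchangeably, because the session ID is only a key into the store. The hot path never touches the store, so tool calls keep working in this sketch even when the store is slow.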

Do This

  • Default to stateless sessions unless you have a concrete reason for server-side state
  • Externalize stateful session data to Redis or a database — never keep it only in memory
  • Design sessions to be rebuildable: if the state is lost, the client can re-negotiate
  • Monitor session count and memory usage per instance to catch session leaks
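
The "rebuildable" guideline can be sketched from the client side. FakeServer, RebuildingClient, and SessionExpired are hypothetical names, and the server is a stub that can be told to forget everything. The assumption is that the client keeps enough information (here, its capability list) to negotiate a fresh session on its own.

```python
class SessionExpired(Exception):
    """Raised by the stub server when it no longer knows a session ID."""


class FakeServer:
    """Hypothetical server stub that can lose its session state."""

    def __init__(self) -> None:
        self._sessions: dict[str, list[str]] = {}
        self._next_id = 0

    def initialize(self, capabilities: list[str]) -> str:
        self._next_id += 1
        session_id = f"s-{self._next_id}"
        self._sessions[session_id] = capabilities
        return session_id

    def call(self, session_id: str, tool: str) -> str:
        if session_id not in self._sessions:
            raise SessionExpired(session_id)
        return f"{tool} ok"

    def lose_all_state(self) -> None:
        self._sessions.clear()


class RebuildingClient:
    def __init__(self, server: FakeServer, capabilities: list[str]) -> None:
        self._server = server
        self._capabilities = capabilities  # kept so the session can be rebuilt
        self.session_id = server.initialize(capabilities)

    def call(self, tool: str) -> str:
        try:
            return self._server.call(self.session_id, tool)
        except SessionExpired:
            # State was lost: negotiate a fresh session, then retry exactly once.
            self.session_id = self._server.initialize(self._capabilities)
            return self._server.call(self.session_id, tool)
```

After lose_all_state(), the next client.call(...) transparently re-negotiates and succeeds under a new session ID. A single retry is a deliberate limit in this sketch, so a server that rejects every session cannot trap the client in a loop.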

Avoid This

  • Store session state in a global in-memory map and call it production-ready
  • Assume sticky load balancing is sufficient — the sticky instance will eventually fail
  • Ignore session cleanup — leaked sessions consume memory until the process OOMs
  • Mix stateful and stateless semantics without clear boundaries — pick one per subsystem
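
Session cleanup can be sketched as idle-time expiry plus a periodic sweep. ExpiringSessions is a hypothetical class that shows the idea in-process, with an injectable clock so it can be tested. With an external store such as Redis, key expiry would normally do this job instead.

```python
import time
from typing import Any, Callable, Optional


class ExpiringSessions:
    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # session_id -> (last access time, session state)
        self._sessions: dict[str, tuple[float, dict[str, Any]]] = {}

    def put(self, session_id: str, state: dict[str, Any]) -> None:
        self._sessions[session_id] = (self._clock(), state)

    def get(self, session_id: str) -> Optional[dict[str, Any]]:
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        last_seen, state = entry
        if self._clock() - last_seen > self._ttl:
            del self._sessions[session_id]  # expired: drop it on access
            return None
        self._sessions[session_id] = (self._clock(), state)  # refresh idle timer
        return state

    def sweep(self) -> int:
        """Remove every idle session and return how many were removed."""
        now = self._clock()
        expired = [
            sid for sid, (seen, _) in self._sessions.items()
            if now - seen > self._ttl
        ]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)

    def __len__(self) -> int:
        return len(self._sessions)  # export this as the session-count metric
```

Calling sweep() on a timer bounds memory even for sessions that are never touched again, and len(sessions) gives the per-instance session count that the monitoring guideline asks for.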