MP-301c · Module 2

Shared State & Locking Strategies

3 min read

When you must share state between tool handlers, choose the narrowest locking granularity that provides correctness. A global mutex serializes all tool calls — correct but slow. A per-resource mutex (one mutex per customer ID, for example) allows concurrent operations on different resources while serializing operations on the same resource. A read-write lock allows unlimited concurrent reads but exclusive writes. Match the lock to the access pattern:

  • Mostly reads with rare writes → read-write lock
  • Frequent writes to different resources → per-resource mutex
  • Frequent writes to the same resource → global mutex (there is no parallelism to preserve)
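The read-write semantics are easy to sketch in plain TypeScript without any library. The RWLock class below is a minimal illustration of the idea (no fairness policy — writers can starve under a constant stream of readers), not a production implementation:

```typescript
// Minimal read-write lock sketch: concurrent readers, exclusive writers.
// Illustrative only — writers can starve if readers never stop arriving.
class RWLock {
  private readers = 0;
  private writing = false;
  private waiters: Array<() => void> = [];

  // Returns a release function, in the style of async-mutex's acquire()
  async acquireRead(): Promise<() => void> {
    while (this.writing) await this.blocked();
    this.readers++;
    return () => { this.readers--; this.wake(); };
  }

  async acquireWrite(): Promise<() => void> {
    while (this.writing || this.readers > 0) await this.blocked();
    this.writing = true;
    return () => { this.writing = false; this.wake(); };
  }

  private blocked(): Promise<void> {
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  private wake(): void {
    // Wake every waiter; each one re-checks its condition in the while loop
    for (const resolve of this.waiters.splice(0)) resolve();
  }
}
```

Readers overlap freely; a writer waits until every reader has released, then excludes everything else until it releases.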

Deadlocks in MCP servers are rare but devastating. They happen when handler A holds lock X and waits for lock Y, while handler B holds lock Y and waits for lock X. Both handlers block forever, and the server appears frozen. Prevention is simpler than detection: always acquire locks in a consistent order (alphabetical by resource ID, for example), set lock acquisition timeouts (if you cannot get the lock in 5 seconds, fail the tool call), and never hold a lock across an external API call (the API might be slow or down, holding the lock for the entire duration).
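The third rule is worth showing concretely. The sketch below uses a tiny promise-chain mutex and a simulated slow API (SimpleMutex, fetchExchangeRate, and the price variable are all hypothetical stand-ins, not part of async-mutex): the slow external call happens before the lock is taken, so the critical section contains only the fast local mutation.

```typescript
// A tiny promise-chain mutex, just for this sketch (not production code).
class SimpleMutex {
  private tail: Promise<void> = Promise.resolve();

  async runExclusive<T>(fn: () => Promise<T>): Promise<T> {
    const prev = this.tail;
    let release!: () => void;
    this.tail = new Promise<void>((resolve) => (release = resolve));
    await prev; // wait for every earlier critical section to finish
    try {
      return await fn();
    } finally {
      release();
    }
  }
}

const priceMutex = new SimpleMutex();
let price = 100;

// Simulated slow external API — stands in for any network call
async function fetchExchangeRate(): Promise<number> {
  await new Promise((r) => setTimeout(r, 50));
  return 1.1;
}

async function updatePrice(): Promise<void> {
  const rate = await fetchExchangeRate(); // slow work OUTSIDE the lock
  await priceMutex.runExclusive(async () => {
    price = Math.round(price * rate);     // critical section: fast local mutation only
  });
}
```

If the fetch were moved inside runExclusive, a 50ms (or 50-second) API call would hold the lock for its entire duration and stall every other caller — exactly the failure mode the rule warns about.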

import { Mutex, MutexInterface, withTimeout, E_TIMEOUT } from "async-mutex";

// Per-resource lock map with automatic cleanup
class ResourceLockMap {
  private locks = new Map<string, { mutex: MutexInterface; refs: number }>();
  private timeoutMs: number;

  constructor(timeoutMs: number = 5000) {
    this.timeoutMs = timeoutMs;
  }

  async withLock<T>(resourceId: string, fn: () => Promise<T>): Promise<T> {
    // Get or create lock for this resource
    let entry = this.locks.get(resourceId);
    if (!entry) {
      // withTimeout makes acquire() reject with E_TIMEOUT instead of queueing
      // forever. (Racing a bare acquire() against a timer would leak the lock:
      // the losing acquire still resolves later and its releaser is never called.)
      entry = { mutex: withTimeout(new Mutex(), this.timeoutMs), refs: 0 };
      this.locks.set(resourceId, entry);
    }
    entry.refs++;

    try {
      // Acquire with timeout to prevent deadlocks
      let release: MutexInterface.Releaser;
      try {
        release = await entry.mutex.acquire();
      } catch (err) {
        if (err === E_TIMEOUT) {
          throw new Error(
            `Lock timeout on resource "${resourceId}" after ${this.timeoutMs}ms. ` +
              `Another operation may be holding the lock. Retry in a few seconds.`
          );
        }
        throw err;
      }

      try {
        return await fn();
      } finally {
        release();
      }
    } finally {
      entry.refs--;
      // Clean up unused locks to prevent memory leaks
      if (entry.refs === 0) {
        this.locks.delete(resourceId);
      }
    }
  }
}

// Usage in a handler
const locks = new ResourceLockMap(5000);

async function transferHandler(args: { from: string; to: string; amount: number }) {
  if (args.from === args.to) {
    return { isError: true, content: [{ type: "text", text: "Cannot transfer to the same account" }] };
  }
  // Consistent ordering prevents deadlocks; the guard above also prevents
  // nesting withLock on the same ID, which would always time out
  const [first, second] = [args.from, args.to].sort();

  return locks.withLock(first, () =>
    locks.withLock(second, async () => {
      const fromBal = await db.getBalance(args.from);
      if (fromBal < args.amount) {
        return { isError: true, content: [{ type: "text", text: "Insufficient balance" }] };
      }
      await db.setBalance(args.from, fromBal - args.amount);
      await db.setBalance(args.to, (await db.getBalance(args.to)) + args.amount);
      return { content: [{ type: "text", text: `Transferred ${args.amount}` }] };
    })
  );
}

Do This

  • Use per-resource locks when different resources can be modified independently
  • Always acquire multiple locks in a consistent order to prevent deadlocks
  • Set acquisition timeouts — a 5-second timeout prevents infinite waits
  • Clean up unused lock entries to prevent memory leaks in long-running servers

Avoid This

  • Use a global mutex when a per-resource lock would allow more concurrency
  • Hold locks across external API calls — slow APIs hold the lock for everyone
  • Acquire locks in inconsistent order — this is the primary deadlock cause
  • Skip lock timeouts — a deadlocked server is worse than a failed tool call