MP-301b · Module 1

Testing Tool Handlers

The fastest path to testable MCP tools is to separate your business logic from the MCP protocol layer. Write each tool handler as a plain async function with explicit inputs: it takes typed arguments and its dependencies (database, API, filesystem) and returns a result object. It is not pure in the strict sense, since those dependencies perform I/O, but every effect goes through something the caller passed in. The MCP SDK wiring (setRequestHandler, schema routing, transport connection) is infrastructure that wraps your handler. Extract the handler into its own module, test it directly with your standard test framework, and let the SDK handle protocol concerns.
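As a rough sketch of that split, the snippet below keeps handlers in a plain registry and exposes a single `callTool` function for the protocol layer to delegate to. The names `ToolResult`, `Deps`, `ToolHandler`, `handlers`, `callTool`, and the `echo` tool are illustrative assumptions, not SDK types. In a real server, the callback registered with the SDK would be a thin wrapper that forwards the tool name and arguments to something like `callTool`.

```typescript
// Hypothetical result shape for illustration; a real server would use the SDK's type.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Dependencies are passed in explicitly. `echo` needs none, so any object works here.
type Deps = Record<string, unknown>;

type ToolHandler = (args: Record<string, unknown>, deps: Deps) => Promise<ToolResult>;

// Business logic: no transport, no schema routing, just arguments in and a result out.
const echoHandler: ToolHandler = async (args) => {
  const message = args.message;
  if (typeof message !== "string" || message.length === 0) {
    return {
      isError: true,
      content: [{ type: "text", text: "message must be a non-empty string." }],
    };
  }
  return { content: [{ type: "text", text: message }] };
};

// Registry mapping tool names to handlers. A Map avoids accidental hits on
// inherited object properties such as "toString".
const handlers = new Map<string, ToolHandler>([["echo", echoHandler]]);

// The only function the protocol layer needs to call.
async function callTool(
  name: string,
  args: Record<string, unknown>,
  deps: Deps,
): Promise<ToolResult> {
  const handler = handlers.get(name);
  if (!handler) {
    const known = Array.from(handlers.keys()).join(", ");
    return {
      isError: true,
      content: [{ type: "text", text: `Unknown tool "${name}". Available tools: ${known}.` }],
    };
  }
  return handler(args, deps);
}
```

Because `callTool` and `echoHandler` know nothing about transports, a test can call either one directly without starting a server.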

Handler tests should cover four categories: happy path (valid input produces correct output), validation (invalid input produces a helpful error with isError: true), edge cases (empty strings, maximum values, Unicode, special characters), and error propagation (downstream failures produce clean error messages, not stack traces). Edge cases are where MCP tools most often break in production, because the LLM sends inputs that a human never would, like a 10,000-character search query or a customer ID with an emoji in it.
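One way to keep the edge-case category from being skipped is to write the inputs down as a table and run every row through the same validation. The sketch below assumes a hypothetical `validateCustomerId` helper and a `CUS-` plus digits ID format (borrowed from this module's test examples); the length limit is an arbitrary value chosen for illustration.

```typescript
// Illustrative limit, not taken from any specification.
const MAX_ID_LENGTH = 32;

// Hypothetical validator: returns null when the ID is acceptable,
// or a message the model can act on when it is not.
function validateCustomerId(raw: unknown): string | null {
  if (typeof raw !== "string") {
    return "customer_id must be a string like CUS-001.";
  }
  if (raw.length === 0) {
    return "customer_id is empty. Expected a value like CUS-001.";
  }
  if (raw.length > MAX_ID_LENGTH) {
    return `customer_id is too long (${raw.length} characters). Expected a value like CUS-001.`;
  }
  // \d matches ASCII digits only, so full-width digits and emoji are rejected.
  if (!/^CUS-\d+$/.test(raw)) {
    return "customer_id must match CUS-<digits>, for example CUS-001.";
  }
  return null;
}

// Each row: a label, an input an LLM might plausibly send, and whether it should pass.
const edgeCases: { label: string; input: unknown; valid: boolean }[] = [
  { label: "well-formed ID", input: "CUS-001", valid: true },
  { label: "empty string", input: "", valid: false },
  { label: "whitespace only", input: "   ", valid: false },
  { label: "emoji in ID", input: "CUS-🙂1", valid: false },
  { label: "full-width digits", input: "CUS-００１", valid: false },
  { label: "trailing newline", input: "CUS-001\n", valid: false },
  { label: "10,000 characters", input: "CUS-" + "9".repeat(9996), valid: false },
  { label: "number instead of string", input: 1, valid: false },
];

// Returns the labels of rows whose outcome differs from the expectation.
function failingCases(): string[] {
  return edgeCases
    .filter((c) => (validateCustomerId(c.input) === null) !== c.valid)
    .map((c) => c.label);
}
```

In vitest, the same table could be passed to `it.each` so that every row is reported as its own test.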

Mock your dependencies at the boundary. If your tool calls a database, inject the database client so tests can substitute a mock. If it calls an external API, inject the HTTP client. The test should verify your handler's logic — not that PostgreSQL works or that the weather API is up. Use dependency injection via constructor arguments or a context object passed to every handler. This makes your handlers testable, your tests fast, and your server configurable across environments.
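A minimal sketch of the context-object approach follows. The `HandlerContext` shape, the `createInMemoryCrm` fake, and the `countActiveHandler` tool are made up for illustration. The point is only that the handler never constructs its own clients, so a test or a staging environment can hand it a different implementation.

```typescript
interface Customer {
  id: string;
  name: string;
  status: "active" | "inactive";
}

// The boundary the handler depends on. Production code would implement this
// with a real database or HTTP client; tests can use the in-memory version below.
interface CrmClient {
  listCustomers(): Promise<Customer[]>;
}

// Context object passed to every handler.
interface HandlerContext {
  crm: CrmClient;
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hand-written fake: no network, deterministic results.
function createInMemoryCrm(customers: Customer[]): CrmClient {
  return {
    listCustomers: async () => customers.slice(),
  };
}

// Hypothetical handler: it only sees the CrmClient interface, never a concrete client.
async function countActiveHandler(
  _args: Record<string, unknown>,
  ctx: HandlerContext,
): Promise<ToolResult> {
  try {
    const customers = await ctx.crm.listCustomers();
    const active = customers.filter((c) => c.status === "active").length;
    return { content: [{ type: "text", text: JSON.stringify({ active }) }] };
  } catch {
    // The underlying error is deliberately not included in the message.
    return {
      isError: true,
      content: [{ type: "text", text: "CRM is temporarily unavailable. Try again shortly." }],
    };
  }
}
```

A hand-written fake like this is an alternative to `vi.fn()` mocks when several tests need the same realistic data; either approach replaces the dependency at the same boundary.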

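import { beforeEach, describe, it, expect, vi } from "vitest";
import { getCustomerHandler } from "../../src/handlers/customer.js";
import type { CrmClient } from "../../src/types.js";

describe("getCustomerHandler", () => {
  // Mocked CRM client injected in place of the real dependency.
  const mockCrm: CrmClient = {
    findCustomer: vi.fn(),
    listCustomers: vi.fn(),
  };

  // Reset call history and stubbed return values so tests cannot affect each other.
  beforeEach(() => {
    vi.resetAllMocks();
  });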

  it("returns customer data for valid ID", async () => {
    vi.mocked(mockCrm.findCustomer).mockResolvedValue({
      id: "CUS-001", name: "Acme Corp", status: "active",
    });

    const result = await getCustomerHandler(
      { customer_id: "CUS-001" },
      { crm: mockCrm },
    );

    expect(result.isError).toBeFalsy();
    const data = JSON.parse(result.content[0].text);
    expect(data.name).toBe("Acme Corp");
  });

  it("returns actionable error for invalid ID format", async () => {
    const result = await getCustomerHandler(
      { customer_id: "invalid!" },
      { crm: mockCrm },
    );

    expect(result.isError).toBe(true);
    expect(result.content[0].text).toContain("CUS-");
    expect(result.content[0].text).toContain("list_customers");
  });

  it("handles not-found gracefully", async () => {
    vi.mocked(mockCrm.findCustomer).mockResolvedValue(null);

    const result = await getCustomerHandler(
      { customer_id: "CUS-999" },
      { crm: mockCrm },
    );

    expect(result.isError).toBe(true);
    expect(result.content[0].text).toContain("not found");
  });

  it("handles database errors without leaking internals", async () => {
    vi.mocked(mockCrm.findCustomer).mockRejectedValue(
      new Error("ECONNREFUSED 127.0.0.1:5432"),
    );

    const result = await getCustomerHandler(
      { customer_id: "CUS-001" },
      { crm: mockCrm },
    );

    expect(result.isError).toBe(true);
    expect(result.content[0].text).not.toContain("ECONNREFUSED");
    expect(result.content[0].text).toContain("temporarily unavailable");
  });
});
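
For reference, here is one handler shape that would satisfy the four tests above. It is a sketch, not the course's actual `customer.js`: the `CUS-` plus digits ID pattern, the exact error wording, and the local `Customer`, `CrmClient`, and `ToolResult` types are assumptions inferred from the test assertions.

```typescript
// Local stand-ins for the types the test file imports (assumed shapes).
interface Customer {
  id: string;
  name: string;
  status: string;
}

interface CrmClient {
  findCustomer(id: string): Promise<Customer | null>;
  listCustomers(): Promise<Customer[]>;
}

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Assumed ID format: "CUS-" followed by one or more digits.
const CUSTOMER_ID_PATTERN = /^CUS-\d+$/;

function errorResult(text: string): ToolResult {
  return { isError: true, content: [{ type: "text", text }] };
}

export async function getCustomerHandler(
  args: { customer_id: string },
  deps: { crm: CrmClient },
): Promise<ToolResult> {
  // Validation: reject malformed IDs before touching the CRM.
  if (!CUSTOMER_ID_PATTERN.test(args.customer_id)) {
    return errorResult(
      "Invalid customer_id format. Expected CUS- followed by digits, for example CUS-001. " +
        "Call list_customers to find valid IDs.",
    );
  }

  let customer: Customer | null;
  try {
    customer = await deps.crm.findCustomer(args.customer_id);
  } catch {
    // Error propagation: report the failure without exposing the underlying cause.
    return errorResult("The CRM is temporarily unavailable. Retry in a few seconds.");
  }

  if (customer === null) {
    return errorResult(
      `Customer ${args.customer_id} not found. Call list_customers to see existing customers.`,
    );
  }

  // Happy path: serialize the record as text content.
  return { content: [{ type: "text", text: JSON.stringify(customer) }] };
}
```

Each branch maps to one test category: the pattern check covers validation, the `catch` covers error propagation, and the null check covers the not-found case.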

Extract handlers into pure functions. Move each tool's logic into a standalone function: `(args, deps) => Promise<ToolResult>`. The deps object holds injected dependencies (db, api clients).
Test all four categories. Write tests for happy path, validation errors, edge cases (empty strings, huge inputs, Unicode), and downstream failures. Edge cases are where production breaks happen.
Assert on error message content. Error messages are prompts. Assert that they contain recovery hints (alternative tools, valid formats, retry guidance). This is testing your tool's UX.
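
The last point can be turned into a small reusable check. The helper below is a sketch under assumptions: `reviewErrorMessage` and `INTERNAL_PATTERNS` are hypothetical names, and the leak patterns are a short, non-exhaustive sample rather than a complete filter.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Patterns that suggest internals leaked into a message (illustrative, not exhaustive).
const INTERNAL_PATTERNS: RegExp[] = [
  /^\s+at .+:\d+:\d+\)?$/m, // stack frame lines
  /\b(ECONNREFUSED|ECONNRESET|ETIMEDOUT|ENOTFOUND)\b/, // common Node.js error codes
  /\b\d{1,3}(\.\d{1,3}){3}(:\d+)?\b/, // IPv4 addresses, optionally with a port
];

// Returns a list of problems found in an error result; an empty list means it passed.
function reviewErrorMessage(result: ToolResult, requiredHints: string[]): string[] {
  const problems: string[] = [];
  if (!result.isError) {
    problems.push("result is not flagged with isError: true");
  }
  const text = result.content.map((c) => c.text).join("\n");
  for (const hint of requiredHints) {
    if (!text.includes(hint)) {
      problems.push(`missing recovery hint: ${hint}`);
    }
  }
  for (const pattern of INTERNAL_PATTERNS) {
    if (pattern.test(text)) {
      problems.push(`leaks internal detail matching ${pattern}`);
    }
  }
  return problems;
}
```

In a vitest suite this could be used as `expect(reviewErrorMessage(result, ["CUS-", "list_customers"])).toEqual([])`, so a failure lists every missing hint and leak at once.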