RC-401b · Module 1

Testing & Validation Pipeline

An agent without a testing pipeline is a liability wearing a productivity costume. You might not notice the defects for days — the agent ships code that passes linting but fails at runtime, generates emails that sound correct but contain hallucinated data, or makes API calls that succeed individually but violate business logic when composed.

The validation pipeline has three layers, and each layer catches a different class of failure. Layer one: static verification. TypeScript compilation (npx tsc --noEmit), linting, format checks. These catch structural errors — wrong types, missing imports, syntax violations. Layer two: behavioral testing. Unit tests and integration tests (Vitest, @testing-library) that verify the agent's output matches expected behavior. Layer three: contract testing. OpenClaw testing patterns that verify the agent's actions conform to defined protocols — correct API endpoints, valid payload schemas, expected side effects and nothing more.

Layer 1: Static Verification

Add validation commands to your CLAUDE.md so the agent runs them after every significant change. At minimum: npx tsc --noEmit for type checking, your linter for style enforcement, and a dry-run build if your toolchain supports one. These run in seconds and catch 60-70% of defects before any test suite executes. Configure CC to treat these as blocking: no commit proceeds until static verification passes.
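
The blocking behavior described above can be sketched as a small gate script. This is a minimal sketch, not a prescribed setup: the names StaticCheck, Runner, and runStaticGate are hypothetical, and ESLint stands in for "your linter" as an assumption. The command runner is injectable so the gate logic can be exercised without spawning real processes.

```typescript
import { spawnSync } from "node:child_process";

// One static check: a label plus the command line to run.
interface StaticCheck {
  name: string;
  command: string;
  args: string[];
}

// Runs a command and returns its exit code. Injectable for testing.
type Runner = (command: string, args: string[]) => number;

// Default runner: executes synchronously and streams output to the terminal.
const spawnRunner: Runner = (command, args) => {
  const result = spawnSync(command, args, { stdio: "inherit" });
  return result.status ?? 1; // a null status (spawn error or signal) counts as failure
};

// Assumed check list: the type check comes from the text, the ESLint entry is a placeholder.
const defaultChecks: StaticCheck[] = [
  { name: "typecheck", command: "npx", args: ["tsc", "--noEmit"] },
  { name: "lint", command: "npx", args: ["eslint", "."] },
];

// Runs every check in order and returns the names of the ones that failed.
function runStaticGate(checks: StaticCheck[], run: Runner = spawnRunner): string[] {
  const failed: string[] = [];
  for (const check of checks) {
    if (run(check.command, check.args) !== 0) {
      failed.push(check.name);
    }
  }
  return failed;
}
```

A pre-commit hook could call runStaticGate(defaultChecks) and exit with a non-zero code when the returned list is not empty, which is what makes the layer blocking. The sketch runs every check rather than stopping at the first failure, so the agent sees all static errors in one pass.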

Layer 2: Behavioral Testing

Write tests that describe what the agent should produce, not how it produces it. For a code-generating agent: "given this prompt, the output compiles and passes these assertions." For a content agent: "the output contains these required sections and does not exceed this word count." Run these via npx vitest run after every generation cycle. Keep test execution under 30 seconds; slow tests create feedback loops the agent ignores.
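
For the content-agent case, the check quoted above can be written as a plain function that inspects only the output. This is a sketch under assumptions: ContentSpec and checkContent are hypothetical names, and sections are matched as literal substrings, which a real suite might tighten.

```typescript
// Expectations for one piece of generated content (hypothetical shape).
interface ContentSpec {
  requiredSections: string[]; // section titles that must appear verbatim
  maxWords: number; // upper bound on the total word count
}

// Returns human-readable failures; an empty list means the output passes.
function checkContent(output: string, spec: ContentSpec): string[] {
  const failures: string[] = [];
  for (const section of spec.requiredSections) {
    if (!output.includes(section)) {
      failures.push(`missing section: ${section}`);
    }
  }
  // Count whitespace-separated words; an empty output has zero words.
  const trimmed = output.trim();
  const wordCount = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  if (wordCount > spec.maxWords) {
    failures.push(`word count ${wordCount} exceeds limit ${spec.maxWords}`);
  }
  return failures;
}

const sampleOutput = "Summary\nThe build passed.\nNext Steps\nDeploy to staging.";
const sampleSpec: ContentSpec = { requiredSections: ["Summary", "Next Steps"], maxWords: 50 };
console.log(checkContent(sampleOutput, sampleSpec)); // prints []
```

Inside a Vitest file, the same function could back an assertion such as expect(checkContent(output, spec)).toEqual([]). Returning a list of failures instead of a boolean gives the agent a concrete message to act on when a run fails.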

Layer 3: Contract Testing

OpenClaw testing patterns verify that agent actions conform to defined protocols. Define contracts for every external interaction: API calls must hit specific endpoints with specific schemas, file writes must target allowed directories, messages must contain required metadata. Test contracts by replaying recorded agent sessions against the contract validator. Any violation means the agent has drifted from its operational specification.
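
Replaying a recorded session against a contract might look like the sketch below. It is illustrative only and does not reproduce OpenClaw's actual format or API: AgentAction, Contract, and validateSession are hypothetical, the payload schema is reduced to a list of required keys, and message metadata checks are left out for brevity.

```typescript
// One recorded agent action (hypothetical shapes for illustration).
type AgentAction =
  | { kind: "api_call"; endpoint: string; payload: Record<string, unknown> }
  | { kind: "file_write"; path: string };

interface Contract {
  allowedEndpoints: Map<string, string[]>; // endpoint -> required payload keys
  allowedDirs: string[]; // directory prefixes that may be written to
}

// Replays a session and returns one message per contract violation.
// Assumes file paths are already normalized (no ".." segments).
function validateSession(actions: AgentAction[], contract: Contract): string[] {
  const violations: string[] = [];
  actions.forEach((action, index) => {
    if (action.kind === "api_call") {
      const requiredKeys = contract.allowedEndpoints.get(action.endpoint);
      if (requiredKeys === undefined) {
        violations.push(`#${index}: endpoint not in contract: ${action.endpoint}`);
        return;
      }
      for (const key of requiredKeys) {
        if (!Object.prototype.hasOwnProperty.call(action.payload, key)) {
          violations.push(`#${index}: payload missing "${key}"`);
        }
      }
    } else {
      const allowed = contract.allowedDirs.some((dir) => action.path.startsWith(dir));
      if (!allowed) {
        violations.push(`#${index}: write outside allowed directories: ${action.path}`);
      }
    }
  });
  return violations;
}

const sampleContract: Contract = {
  allowedEndpoints: new Map([["/v1/orders", ["orderId"]]]),
  allowedDirs: ["out/"],
};
const sampleSession: AgentAction[] = [
  { kind: "api_call", endpoint: "/v1/orders", payload: { orderId: 7 } },
  { kind: "api_call", endpoint: "/v1/refunds", payload: {} },
  { kind: "file_write", path: "tmp/debug.log" },
];
const sampleViolations = validateSession(sampleSession, sampleContract);
console.log(sampleViolations);
```

The sample session yields two violations: the call to an endpoint the contract does not list, and the write outside out/. Anything the contract does not explicitly allow is reported, which matches the "expected side effects and nothing more" requirement.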