CC-301i · Module 1

CLAUDE.md as Persistent Cache

CLAUDE.md is loaded at the start of every session. This makes it the most efficient form of persistent context — information that Claude needs across all sessions, documented once, available always. Every fact in CLAUDE.md is a fact you never need to re-explain. Every rule is a correction you never need to repeat. The token cost is paid once per session; the benefit accrues across every prompt in that session.

Think of CLAUDE.md as a level-one cache. The information Claude needs most frequently — build commands, key file locations, architectural patterns, naming conventions — should be in CLAUDE.md because it is loaded before the first prompt. Information Claude needs occasionally — specific API schemas, deployment procedures, one-off configurations — can be in separate files that Claude reads on demand. Information Claude needs rarely — historical decisions, deprecated patterns, old migration guides — should not be loaded at all unless specifically requested.
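The placement rule above can be sketched as a small lookup. This is only an illustration: the names `Placement`, `PLACEMENT_BY_NEED`, and `place`, and the labels "frequent", "occasional", and "rare", are hypothetical and simply mirror the three levels of need described in the paragraph.

```python
from enum import Enum


class Placement(Enum):
    """Where a piece of project context should live (hypothetical names)."""
    ALWAYS_LOADED = "in CLAUDE.md, loaded before the first prompt"
    ON_DEMAND = "in a separate file that Claude reads on demand"
    ON_REQUEST = "not loaded unless specifically requested"


# Assumption: need is expressed with one of three labels taken from the text.
PLACEMENT_BY_NEED = {
    "frequent": Placement.ALWAYS_LOADED,
    "occasional": Placement.ON_DEMAND,
    "rare": Placement.ON_REQUEST,
}


def place(need: str) -> Placement:
    """Map how often Claude needs a piece of information to where it belongs."""
    return PLACEMENT_BY_NEED[need]


# Examples drawn from the paragraph above.
examples = {
    "build commands": "frequent",
    "specific API schemas": "occasional",
    "old migration guides": "rare",
}
placement = {item: place(need) for item, need in examples.items()}

for item, where in placement.items():
    print(f"{item}: {where.value}")
```

The point of the sketch is that the decision depends only on how often the information is needed, not on how important it feels.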

L1 Cache: CLAUDE.md
Loaded every session. Contains: project identity, hard rules, build commands, architecture overview, key file locations, naming conventions. Budget: 200 lines. Token cost: ~2,000 tokens per session, amortized across all prompts.

L2 Cache: Referenced Files
Loaded on demand when Claude needs them. Contains: API schemas, component patterns, test infrastructure, deployment checklists. Referenced in CLAUDE.md by path so Claude knows where to find them.

L3 Cache: Deep Context
Loaded only when explicitly requested. Contains: historical decisions, deprecated patterns, migration guides. Not referenced in CLAUDE.md. The user provides the path when needed.
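One way to keep the always-loaded tier within its stated budget is a quick check like the sketch below. The 200-line and roughly 2,000-token figures come from the text. The four-characters-per-token ratio is an assumption used as a rough heuristic, not a real tokenizer, and the function name `check_budget` is hypothetical.

```python
LINE_BUDGET = 200      # line budget stated for CLAUDE.md
TOKEN_BUDGET = 2000    # approximate per-session token cost stated in the text
CHARS_PER_TOKEN = 4    # assumption: rough heuristic, not an actual tokenizer


def check_budget(text: str) -> dict:
    """Report line count and a rough token estimate for CLAUDE.md content."""
    line_count = len(text.splitlines())
    estimated_tokens = len(text) // CHARS_PER_TOKEN
    return {
        "lines": line_count,
        "estimated_tokens": estimated_tokens,
        "within_budget": (
            line_count <= LINE_BUDGET and estimated_tokens <= TOKEN_BUDGET
        ),
    }


# Usage with a small in-memory sample; in practice the text would be
# read from the project's CLAUDE.md file.
sample = "# Project\n\nBuild: make build\nTests: make test\n"
report = check_budget(sample)
print(report)
```

If the check fails, the tiering above suggests moving occasionally needed material into referenced files rather than trimming the rules Claude needs in every session.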