CC-201a · Module 4

The 5-Feature Decision Matrix

4 min read

Claude Code gives you five distinct customization features, and each one occupies a different position on the context-cost spectrum. CLAUDE.md is always-on — it loads into every session, consuming roughly 100% of its token footprint whether the rules are relevant or not. Skills are on-demand — only the description sits in context at all times, loading the full instructions when activated, costing approximately 15% relative to an equivalent CLAUDE.md entry. MCP servers register their tool definitions at session start, landing at roughly 45% context cost because every tool schema sits in memory regardless of usage. Sub-agents run in their own isolated context window, costing your main session zero tokens. Hooks execute as shell commands on system events, also costing zero tokens in your context. These are not interchangeable. Each feature has a specific trigger condition, a specific cost profile, and a specific failure mode when misapplied.

The decision framework reduces to one question per feature. Should Claude always know this? Put it in CLAUDE.md. Should Claude know this only when the task demands it? Make it a Skill. Should this work run in isolation with its own context window? Delegate to a Sub-agent. Should this happen automatically on every matching event, regardless of what was asked? Wire it as a Hook. Does Claude need to reach an external service for data or actions? Connect an MCP server. One question, one answer, no ambiguity. The developers who struggle with Claude Code configuration are almost always putting the right content in the wrong feature. A deployment checklist in CLAUDE.md wastes tokens on every debugging session. A critical safety rule in a Skill risks not loading when needed most.

The cost of misplacement is measurable. A team that crams everything into CLAUDE.md — coding standards, deployment procedures, PR checklists, style guides — might burn 5,000 tokens of overhead per session on rules that apply to 10% of tasks. Over a full day of development across a five-person team, that is tens of thousands of wasted tokens and noticeably earlier compaction triggers. Conversely, a team that moves a critical safety rule like "never modify the production database directly" into a Skill risks that rule not activating during a session where someone is debugging a data issue and inadvertently runs a mutation. The matrix is not about optimization for its own sake. It is about putting each instruction where it will fire reliably at the lowest possible cost.

1. CLAUDE.md — Always-On Instructions Loads every session, every task, no exceptions. Put non-negotiable standards here: language preferences, build commands, architectural constraints. Cost: ~100% of its token size, paid on every session. Decision rule: if Claude should always know it, it belongs here.
2. Skills — On-Demand Expertise Only the description sits in context; full instructions load when the task matches. Put task-specific procedures here: PR review checklists, deployment workflows, commit message formats. Cost: ~15% (description only). Decision rule: if Claude should know it sometimes, make it a Skill.
3. Sub-agents — Isolated Workers Run in their own context window, completely separate from your main session. Put delegatable work here: code exploration, documentation research, bulk file analysis. Cost: 0% to your main context. Decision rule: if it should run in isolation, use a Sub-agent.
4. Hooks — Event-Driven Automation Shell commands that fire on specific lifecycle events: pre-tool, post-tool, session start, session stop. Put mechanical guardrails here: auto-formatting, destructive command blocking, post-edit test runs. Cost: 0% context tokens. Decision rule: if it should happen automatically on every matching event, wire a Hook.
5. MCP Servers — External Integrations Connect Claude to external services via the Model Context Protocol. Put external data needs here: GitHub PRs, Sentry errors, database queries, Figma designs. Cost: ~45% (all tool schemas registered at session start). Decision rule: if Claude needs external tools or data, connect an MCP server.