CC-101 · Module 1

Context Window Anatomy

4 min read

Every Claude Code session starts with a token budget. Understanding where those tokens go is the difference between productive sessions and mysterious quality drops. Opus 4.6 gives you 200,000 tokens, but you don't get all 200,000 for your conversation:

- System prompt (your CLAUDE.md, rules, and memory): roughly 10,000 tokens
- System tools (bash, web search, file operations): another 17,000 tokens
- MCP tools: 2,000 to 20,000 tokens, depending on how many you have installed
- Auto-compaction buffer: about 33,000 tokens

What remains, somewhere between 120,000 and 170,000 tokens, is your actual working space for conversation history, file contents, and reasoning.
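The budget above is simple arithmetic, and it can help to see it written out. The sketch below is illustrative only: the overhead figures are the approximate values from this module, not exact numbers from any official source, and real sessions will vary.

```python
# Rough context-budget arithmetic for a 200K-token session.
# All overhead figures are the approximate values quoted above.
TOTAL_WINDOW = 200_000

OVERHEADS = {
    "system_prompt": 10_000,      # CLAUDE.md, rules, memory
    "system_tools": 17_000,       # bash, web search, file operations
    "compaction_buffer": 33_000,  # reserved for auto-compaction
}

def working_space(mcp_tokens: int) -> int:
    """Tokens left for conversation history, file contents, and reasoning."""
    fixed = sum(OVERHEADS.values())
    return TOTAL_WINDOW - fixed - mcp_tokens

print(working_space(2_000))   # light MCP setup -> 138000
print(working_space(20_000))  # heavy MCP setup -> 120000
```

Note that with these particular overhead estimates the heavy-MCP case lands right at the 120K floor mentioned above, while lighter setups recover tens of thousands of tokens of working space.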

There's a hidden relationship between context size and output quality. Research shows a negative correlation between prompt length and response quality — the longer the context, the more likely Claude is to miss details buried in the middle. This is called the 'lost in the middle' effect. It's why a focused 50K-token conversation often produces better results than a bloated 150K-token one. Every token that isn't directly relevant to your current task is noise that degrades signal.

Extended thinking is your secret weapon for complex reasoning without burning context. When Claude uses extended thinking, the reasoning tokens don't count against your context window, so complex multi-step logic gets full attention without pushing your conversation history out of the window. For anything requiring deep reasoning, such as architectural decisions, debugging complex issues, or multi-file refactors, extended thinking gives you reasoning depth without context cost.