GC-201a · Module 3
Context Compression & Management
3 min read
Even with a 1-million-token context window, long sessions accumulate stale information. The chatCompression setting controls how Gemini CLI manages this: "auto" (the default) compresses conversation history when the session approaches context limits; "aggressive" compresses more frequently to keep context lean; "off" never compresses, which is useful for sessions where you need the exact conversation history preserved for debugging or documentation.
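A minimal settings.json sketch of this setting, using the "auto" value described above (the exact key and accepted values may differ by Gemini CLI version; check your installation's documentation):

```json
{
  "chatCompression": "auto"
}
```

Swap in "aggressive" or "off" as the workflow demands.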
The /compact command triggers manual compression. When /stats shows high token usage and you notice response quality degrading, run /compact to summarize earlier exchanges while preserving key decisions and findings. The compressed context keeps the conclusions without the verbose back-and-forth that produced them. Use it proactively; do not wait for the model to start forgetting things.
# Check current context usage
/stats
# Manual compression when context is bloated
/compact
# Configure compression behavior in settings.json
# "auto" — compress when nearing limits (recommended)
# "aggressive" — compress frequently (lean sessions)
# "off" — never compress (debugging, documentation)
# Clear and start fresh when compression is not enough
/clear
Do This
- Monitor /stats regularly and compress before quality degrades
- Use /clear between unrelated tasks to start with clean context
- Set chatCompression to "auto" for most workflows — it handles the common case well
Avoid This
- Letting sessions run for hours without checking context usage
- Disabling compression and assuming 1M tokens means infinite context
- Ignoring the correlation between context bloat and declining response quality