GC-201a · Module 3
Context Compression & Management
3 min read
Even with a 1-million-token context window, long sessions accumulate stale information. The chatCompression setting controls how Gemini CLI manages this: "auto" (the default) compresses conversation history when the session approaches context limits; "aggressive" compresses more frequently to keep context lean; "off" never compresses, which is useful for sessions where you need the exact conversation history preserved for debugging or documentation.
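A minimal settings.json sketch of this setting, using the "auto" value described above (the exact key and accepted values may differ by Gemini CLI version; check your installation's documentation):

```json
{
  "chatCompression": "auto"
}
```

Swap in "aggressive" or "off" as the workflow demands.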
The /compact command triggers manual compression. When /stats shows high token usage and you notice response quality degrading, run /compact to summarize earlier exchanges while preserving key decisions and findings. The compressed context keeps the conclusions without the verbose back-and-forth that produced them. Use it proactively; do not wait for the model to start forgetting things.
# Check current context usage
/stats
# Manual compression when context is bloated
/compact
# Configure compression behavior in settings.json
# "auto" — compress when nearing limits (recommended)
# "aggressive" — compress frequently (lean sessions)
# "off" — never compress (debugging, documentation)
# Clear and start fresh when compression is not enough
/clear
Do This
- Monitor /stats regularly and compress before quality degrades
- Use /clear between unrelated tasks to start with clean context
- Set chatCompression to "auto" for most workflows — it handles the common case well
Avoid This
- Letting sessions run for hours without checking context usage
- Disabling compression and assuming 1M tokens means infinite context
- Ignoring the correlation between context bloat and declining response quality