GC-301c · Module 1
Cost-Quality Tradeoffs & Fallback Chains
Cost-quality tradeoffs in model selection follow a predictable curve. Moving from Flash to Pro roughly triples cost per token while improving output quality by 15-30% on complex tasks but only 2-5% on simple tasks. The marginal return justifies the extra cost on complex tasks and does not on simple ones. The practical strategy is therefore a two-tier split: Pro for the roughly 30% of tasks that are genuinely complex, Flash for the remaining 70%. This typically reduces total cost by 40-50% compared to Pro-only usage, with negligible quality impact.
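The savings figure can be checked with a little arithmetic. The sketch below assumes Pro costs exactly three times as much per token as Flash and that a task consumes the same number of tokens on either model; both are simplifications, and `blended_cost` is a hypothetical helper name, not part of any tool.

```python
def blended_cost(pro_share: float, price_ratio: float = 3.0) -> float:
    """Cost of a mixed workload relative to Pro-only usage (1.0 = same cost).

    Assumes Pro costs `price_ratio` times as much per token as Flash and
    that each task uses the same number of tokens on either model.
    """
    if not 0.0 <= pro_share <= 1.0:
        raise ValueError("pro_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    flash_share = 1.0 - pro_share
    # Pro tasks cost 1 unit each; Flash tasks cost 1/price_ratio units each.
    return pro_share + flash_share / price_ratio


relative = blended_cost(pro_share=0.30)
savings = 1.0 - relative
print(f"Relative cost: {relative:.1%}, savings: {savings:.1%}")
# prints "Relative cost: 53.3%, savings: 46.7%"
```

Under these assumptions a 30/70 split lands inside the 40-50% range quoted above. Real savings depend on the actual price ratio and on how token usage differs between tiers.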
Fallback chains add resilience to model selection. A fallback chain specifies a primary model and one or more fallback models that activate when the primary is unavailable because of rate limiting, a service disruption, or context overflow. The pattern is: attempt Pro, fall back to Flash, then fall back to a local model or queue the request. In custom commands, you can implement this with shell logic that checks model availability before proceeding. This is especially valuable for CI/CD pipelines and automation, where an unavailable model should not block the entire workflow.
[command]
name = "resilient-review"
description = "Code review with model fallback"
prompt = """
Perform a thorough code review of the changes in this diff:
!{git diff HEAD~1}
Focus on:
1. Correctness — does the logic handle edge cases?
2. Security — any input validation gaps or data exposure risks?
3. Performance — any O(n²) patterns or unnecessary allocations?
4. Maintainability — is the code readable and well-structured?
If the diff is too large to review thoroughly in one pass, limit the
review to correctness and security only.
"""
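The fallback chain pattern described above (attempt Pro, fall back to Flash, then queue the request) can also be sketched in a wrapper script. This is a minimal illustration, not a definitive implementation: `ModelUnavailable`, `run_with_fallback`, `call_pro`, and `call_flash` are hypothetical names, and the two stand-in functions only simulate model calls.

```python
from typing import Callable, List, Optional, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error covering rate limits, outages, or context overflow."""


def run_with_fallback(
    prompt: str,
    chain: Sequence[Tuple[str, Callable[[str], str]]],
    queue: List[str],
) -> Optional[Tuple[str, str]]:
    """Try each (name, call) pair in order; queue the prompt if all fail."""
    for name, call in chain:
        try:
            return name, call(prompt)
        except ModelUnavailable:
            continue  # this tier is unavailable, so try the next one
    queue.append(prompt)  # last resort: defer the work instead of failing
    return None


# Stand-in callables; a real chain would wrap actual model clients here.
def call_pro(prompt: str) -> str:
    raise ModelUnavailable("rate limited")  # simulate an unavailable primary


def call_flash(prompt: str) -> str:
    return f"flash review of: {prompt}"


pending: List[str] = []
result = run_with_fallback(
    "diff text", [("pro", call_pro), ("flash", call_flash)], pending
)
print(result)  # prints ('flash', 'flash review of: diff text')
```

Returning the name of the model that answered lets a pipeline log how often it runs on a fallback tier. Catching only one narrow exception type is deliberate, so that genuine bugs still surface instead of silently triggering a fallback.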
Do This
- Track per-task cost to identify where Pro is justified and where Flash suffices
- Implement fallback chains for automated workflows that must not fail on model availability
- Default to Flash and upgrade to Pro for specific tasks — the cost savings are significant
Avoid This
- Optimizing purely for cost by using the cheapest model everywhere: rework on complex tasks erases the savings
- Ignoring cost entirely because "quality matters most": the difference on routine tasks is negligible
- Building automation that depends on a single model with no fallback: service disruptions will block your pipeline
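The first "Do This" item, tracking per-task cost, could look roughly like the sketch below. The prices in `PRICE_PER_MTOK` are placeholder values chosen only to reflect the roughly 3x ratio mentioned earlier, and `CostTracker` is a hypothetical helper; substitute the real rates and token counts from your own usage data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Placeholder prices in dollars per million tokens (assumed, not real rates).
PRICE_PER_MTOK: Dict[str, float] = {"flash": 1.0, "pro": 3.0}


class CostTracker:
    """Accumulates spend per (task type, model) pair."""

    def __init__(self) -> None:
        self.totals: Dict[Tuple[str, str], float] = defaultdict(float)

    def record(self, task_type: str, model: str, tokens: int) -> float:
        """Add one request's cost to the running total and return that cost."""
        cost = tokens / 1_000_000 * PRICE_PER_MTOK[model]
        self.totals[(task_type, model)] += cost
        return cost

    def report(self) -> List[Tuple[Tuple[str, str], float]]:
        """Return totals sorted from most to least expensive."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)


tracker = CostTracker()
tracker.record("refactor", "pro", 200_000)
tracker.record("lint-fix", "flash", 500_000)
tracker.record("lint-fix", "pro", 500_000)
for (task, model), cost in tracker.report():
    print(f"{task:10s} {model:6s} ${cost:.2f}")
```

A report like this makes it easy to spot routine task types that are running on Pro and are candidates for moving to Flash.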