GC-301c · Module 1

Cost-Quality Tradeoffs & Fallback Chains

Cost-quality tradeoffs in model selection follow a predictable curve of diminishing returns. Moving from Flash to Pro roughly triples cost per token while improving output quality by 15-30% on complex tasks but only 2-5% on simple ones. The marginal return justifies the cost on complex tasks and does not on simple tasks. The practical strategy is therefore a two-tier split: route the roughly 30% of tasks that are genuinely complex to Pro and the remaining 70% to Flash. If tasks consume similar token volumes, that mix costs about 0.30 + 0.70/3 ≈ 53% of a Pro-only workload, which is where the typical 40-50% cost reduction comes from, with negligible quality impact.
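The arithmetic behind that savings estimate can be checked with a short sketch. It assumes every task consumes a similar number of tokens and uses the 3x price ratio and 30/70 split quoted above; the function name `blended_cost_ratio` is hypothetical, and real savings will shift if complex tasks are also the longest ones.

```python
def blended_cost_ratio(pro_share: float, pro_to_flash_cost: float) -> float:
    """Cost of a mixed Pro/Flash workload relative to a Pro-only workload.

    Assumption: every task uses roughly the same number of tokens, so cost
    scales with the share of tasks sent to each model.
    """
    if not 0.0 <= pro_share <= 1.0:
        raise ValueError("pro_share must be between 0 and 1")
    if pro_to_flash_cost <= 0:
        raise ValueError("pro_to_flash_cost must be positive")
    flash_share = 1.0 - pro_share
    # Normalize a Pro task to cost 1.0; a Flash task then costs 1/ratio.
    return pro_share + flash_share / pro_to_flash_cost


# Figures taken from the text: Pro costs about 3x Flash, 30% of tasks go to Pro.
ratio = blended_cost_ratio(pro_share=0.30, pro_to_flash_cost=3.0)
savings = 1.0 - ratio
print(f"Relative cost: {ratio:.2f}, savings vs Pro-only: {savings:.0%}")
```

With these inputs the mixed workload costs a little over half of the Pro-only baseline, which is consistent with the 40-50% range stated above.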

Fallback chains add resilience to model selection. A fallback chain specifies a primary model and one or more fallback models that take over when the primary is unavailable because of rate limiting, a service disruption, or context overflow. The pattern is: attempt Pro, fall back to Flash, then fall back to a local model or queue the request. In custom commands, you can implement this with shell logic that checks model availability before proceeding. This is especially valuable for CI/CD pipelines and other automation, where an unavailable model should not block the entire workflow.

[command]
name = "resilient-review"
description = "Code review with model fallback"
prompt = """
Perform a thorough code review of the changes in this diff:

!{git diff HEAD~1}

Focus on:
1. Correctness — does the logic handle edge cases?
2. Security — any input validation gaps or data exposure risks?
3. Performance — any O(n²) patterns or unnecessary allocations?
4. Maintainability — is the code readable and well-structured?

If the diff is too large to review thoroughly in one pass, narrow the
review to correctness and security only.
"""
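The chain described above (attempt Pro, fall back to Flash, then queue the request) can also be sketched in the calling layer. A rate-limited request never reaches the model, so the switch has to happen in whatever invokes it. The sketch below is illustrative only: `ModelUnavailable`, `run_with_fallback`, `call_pro`, and `call_flash` are hypothetical names, and the two stub functions stand in for however your pipeline actually invokes each model.

```python
from typing import Callable, List, Optional, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error: the model is rate limited, down, or over context."""


def run_with_fallback(
    prompt: str,
    chain: Sequence[Tuple[str, Callable[[str], str]]],
    on_exhausted: Optional[Callable[[str], None]] = None,
) -> Tuple[str, str]:
    """Try each (name, call) pair in order and return (model_name, output).

    If every model is unavailable, hand the prompt to on_exhausted (for
    example, a queue) when one is given; otherwise raise RuntimeError.
    """
    errors: List[str] = []
    for name, call in chain:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next model in the chain.
            errors.append(f"{name}: {exc}")
    if on_exhausted is not None:
        on_exhausted(prompt)
        return "queued", ""
    raise RuntimeError("all models unavailable: " + "; ".join(errors))


# Stubs for illustration only; replace with real model invocations.
def call_pro(prompt: str) -> str:
    raise ModelUnavailable("rate limited")


def call_flash(prompt: str) -> str:
    return f"[flash] reviewed {len(prompt)} chars"


model_used, output = run_with_fallback(
    "example diff text",
    [("pro", call_pro), ("flash", call_flash)],
)
# The stub Pro call fails, so the Flash stub handles the request.
print(model_used, "->", output)
```

Catching only the availability error is a deliberate choice in this sketch: any other failure, such as a malformed request, still surfaces immediately instead of being retried on a cheaper model.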

Do This

Track per-task cost to identify where Pro is justified and where Flash suffices
Implement fallback chains for automated workflows that must not fail on model availability
Default to Flash and upgrade to Pro for specific tasks — the cost savings are significant

Avoid This

Optimize purely for cost by using the cheapest model everywhere — rework on complex tasks erases savings
Ignore cost entirely because "quality matters most" — the difference on routine tasks is negligible
Build automation that depends on a single model with no fallback — service disruptions will block your pipeline