CDX-301i · Module 3

Rate Limiting & Budget Controls

Budget controls prevent runaway costs — the enterprise equivalent of setting a credit card limit. Without controls, a misconfigured pipeline can burn through thousands of dollars in minutes: a retry loop without a budget ceiling, a parallel spawn without a concurrency limit, or a recursive agent that keeps invoking itself. Budget controls operate at multiple levels: per-task limits (no single task exceeds $X), per-pipeline limits (no pipeline run exceeds $Y), per-team daily limits (team Z cannot spend more than $W per day), and system-wide monthly limits.

Rate limiting complements budget controls by capping the number of API calls per time window. Even when spend is within budget, a burst of concurrent requests can exceed the provider's rate limits, and the resulting rejections can cascade into failures across all pipelines. Client-side rate limiting, enforced by the dispatcher, prevents this by throttling task dispatch so that request volume stays under those limits. The rate limiter should track limits per model, since different models may have different rate ceilings.
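As a rough illustration of client-side throttling with model-specific ceilings, the sketch below uses a sliding window of dispatch timestamps per model. The class name `SlidingWindowLimiter`, the `try_acquire` method, and the injectable `clock` parameter are assumptions made for this example, not part of any particular dispatcher or provider SDK.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most N dispatches per window for each model (hypothetical sketch)."""

    def __init__(self, limits: dict[str, int], window_s: float = 60.0,
                 clock=time.monotonic):
        self.limits = limits          # model name -> max requests per window
        self.window_s = window_s
        self.clock = clock            # injectable so tests can control time
        self.sent: dict[str, deque] = {model: deque() for model in limits}

    def try_acquire(self, model: str) -> bool:
        """Record a dispatch and return True if the model has headroom."""
        now = self.clock()
        stamps = self.sent[model]     # raises KeyError for unconfigured models
        # Discard timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_s:
            stamps.popleft()
        if len(stamps) >= self.limits[model]:
            return False              # caller should queue or delay the task
        stamps.append(now)
        return True
```

A dispatcher would call `try_acquire(model)` before sending each request and hold the task in its queue when the call returns `False`, rather than letting the provider reject it.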

from dataclasses import dataclass

@dataclass
class BudgetPolicy:
    per_task_max_usd: float          # Single task ceiling
    per_pipeline_max_usd: float      # Pipeline run ceiling
    daily_team_max_usd: float        # Team daily budget
    monthly_system_max_usd: float    # System monthly budget
    requests_per_minute: int         # Client-side API rate cap

class BudgetEnforcer:
    def __init__(self, policy: BudgetPolicy):
        self.policy = policy
        self.spending: dict[str, float] = {}  # team -> daily spend
        self.system_monthly: float = 0.0

    def can_submit(self, team: str, estimated_cost: float) -> bool:
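        """Check per-task, team-daily, and system-monthly gates before submission.

        The per-pipeline ceiling and requests_per_minute are not checked here;
        enforce them separately (for example, in the dispatcher).
        """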
        # Per-task check
        if estimated_cost > self.policy.per_task_max_usd:
            return False
        # Daily team check
        team_spent = self.spending.get(team, 0.0)
        if team_spent + estimated_cost > self.policy.daily_team_max_usd:
            return False
        # Monthly system check
        if (self.system_monthly + estimated_cost
                > self.policy.monthly_system_max_usd):
            return False
        return True

    def record_spend(self, team: str, actual_cost: float) -> None:
        """Add a completed task's actual cost to the team and system totals."""
        self.spending[team] = self.spending.get(team, 0.0) + actual_cost
        self.system_monthly += actual_cost
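The enforcer above does not consult `per_pipeline_max_usd`. One possible way to add that gate is to track cumulative spend per pipeline run, as in the sketch below. The `PipelineBudgetTracker` class and the `run_id` key are hypothetical names introduced for illustration.

```python
class PipelineBudgetTracker:
    """Track cumulative spend per pipeline run against one ceiling (sketch)."""

    def __init__(self, per_pipeline_max_usd: float):
        self.per_pipeline_max_usd = per_pipeline_max_usd
        self.spent: dict[str, float] = {}   # run_id -> spend so far

    def can_submit(self, run_id: str, estimated_cost: float) -> bool:
        """Return True if the run would stay within its ceiling."""
        projected = self.spent.get(run_id, 0.0) + estimated_cost
        return projected <= self.per_pipeline_max_usd

    def record_spend(self, run_id: str, actual_cost: float) -> None:
        """Add a completed task's actual cost to the run's total."""
        self.spent[run_id] = self.spent.get(run_id, 0.0) + actual_cost
```

In practice this check would sit alongside the task, team, and system gates, so that a submission proceeds only when every level approves it.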

Do This

  • Enforce budget limits at every level: task, pipeline, team daily, and system monthly
  • Estimate costs before submission and reject tasks that would breach budget limits
  • Implement client-side rate limiting to stay within API provider limits proactively
  • Account for cascading costs: a cheap supervisor task can spawn expensive workers, so estimate the cost of the whole task tree
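To make the cascading-cost point concrete, the following sketch sums a task's own estimate with the estimates of everything it is expected to spawn. The `TaskEstimate` structure and the dollar figures are invented for this example; a real system would need its own way of predicting which workers a supervisor will launch.

```python
from dataclasses import dataclass, field


@dataclass
class TaskEstimate:
    name: str
    own_cost_usd: float                        # cost of this task alone
    children: list["TaskEstimate"] = field(default_factory=list)


def total_estimated_cost(task: TaskEstimate) -> float:
    """Return the task's own cost plus the cost of all descendants."""
    return task.own_cost_usd + sum(
        total_estimated_cost(child) for child in task.children
    )


# Illustrative numbers only: a cheap supervisor with four expensive workers.
workers = [TaskEstimate(f"worker-{i}", 2.5) for i in range(4)]
supervisor = TaskEstimate("supervisor", 0.25, workers)
print(total_estimated_cost(supervisor))
```

Passing the tree total, rather than the supervisor's own cost, to the budget check keeps a nominally cheap task from slipping under the per-task ceiling.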

Avoid This

  • Rely solely on the API provider's rate limits: hitting them produces errors, whereas client-side throttling avoids the errors in the first place
  • Set budgets without historical data — measure actual costs for 2-4 weeks before setting limits
  • Ignore per-task limits because "most tasks are cheap" — one runaway task can consume the daily budget
  • Forget to reset daily/monthly counters — stale budget data blocks legitimate work
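One simple way to avoid stale daily counters is to reset lazily: compare the current date with the date the counters belong to on every read or write. The sketch below assumes a hypothetical `DailySpendTracker` class with an injectable `today` callable, and the same idea could be applied to a monthly counter by comparing year and month instead.

```python
from datetime import date


class DailySpendTracker:
    """Per-team daily spend that clears itself when the date changes (sketch)."""

    def __init__(self, today=date.today):
        self._today = today                  # injectable so tests can control time
        self._day = today()                  # the date the counters belong to
        self._spend: dict[str, float] = {}   # team -> spend on self._day

    def _roll_over(self) -> None:
        """Clear all counters if the calendar day has changed."""
        current = self._today()
        if current != self._day:
            self._day = current
            self._spend.clear()

    def spent(self, team: str) -> float:
        self._roll_over()
        return self._spend.get(team, 0.0)

    def record(self, team: str, cost: float) -> None:
        self._roll_over()
        self._spend[team] = self._spend.get(team, 0.0) + cost
```

Because the reset happens inside the tracker, no scheduled job has to remember to run it. The choice of time zone for "today" is left open here and would need to match how the team budget is defined.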