GC-301g · Module 2

Parallel Execution

3 min read

Sequential batch processing is simple but slow: 100 files at 5 seconds each take over 8 minutes. Parallel execution with xargs -P, GNU parallel, or background processes cuts that down (at 4-way concurrency the same batch finishes in roughly 2 minutes) but introduces concurrency challenges. Rate limits become the binding constraint: Google's API enforces requests-per-minute and tokens-per-minute quotas, and exceeding them produces 429 errors that waste time on retries. The optimal parallelism level is the highest concurrency that stays under your rate-limit ceiling.
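A back-of-envelope way to pick that level (a rule of thumb, not an official Google formula, and the quota value here is hypothetical): if one worker's requests take D seconds on average, it issues about 60/D requests per minute, so under an RPM quota Q the ceiling is roughly Q × D / 60 concurrent workers.

```shell
#!/bin/bash
# Estimate a safe --jobs value. Assumes requests are roughly uniform
# and the RPM quota (not tokens-per-minute) is the binding limit.
rpm_quota=60     # hypothetical requests-per-minute quota for your tier
avg_seconds=5    # measured average duration of one gemini call
# Each worker makes ~60/avg_seconds requests per minute, so:
max_jobs=$(( rpm_quota * avg_seconds / 60 ))
echo "max safe --jobs: $max_jobs"
```

In practice, start a job or two below this ceiling to leave headroom for retries and slow responses.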

GNU parallel is the most robust tool for parallel Gemini CLI execution. It handles job distribution, output ordering, progress tracking, retries, and resource throttling. The --jobs flag controls concurrency, --delay adds spacing between job starts (a crude form of rate limiting), and --joblog records each job's exit status so failures can be rerun later with --retry-failed. For production batch jobs, the --joblog/--retry-failed combination is the gold standard.

#!/bin/bash
# Parallel batch with GNU parallel

# Basic parallel (4 concurrent, 1s delay between starts)
mkdir -p reviews
find src/api -name "*.ts" | \
  parallel --jobs 4 --delay 1 \
    'gemini -p "Review {} for security issues" --output-format json \
     > reviews/{/.}.json 2>/dev/null'

# Production-grade with logging and retry
mkdir -p docs logs
find src/ -name "*.ts" | \
  parallel --jobs 4 --delay 2 \
    --joblog batch.log \
    --bar \
    --halt soon,fail=20% \
    'gemini -p "Generate JSDoc for all exports in {}" \
       > docs/{/.}.md 2> logs/{/.}.err'

# Retry only failed jobs
parallel --retry-failed --joblog batch.log

# xargs alternative (simpler, less control)
# Pass the filename as "$1" rather than splicing {} into the script,
# so filenames with spaces or quotes cannot break the command
find src/api -name "*.ts" -print0 | \
  xargs -0 -P 4 -I {} bash -c '
    name=$(basename "$1" .ts)
    gemini -p "Review $1 for bugs" > "reviews/$name.json" 2>/dev/null
  ' _ {}
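When a job does hit a 429 despite throttling, retrying with exponential backoff recovers without manual intervention. A minimal sketch (the wrapper name and retry policy are illustrative, not part of Gemini CLI; it retries on any nonzero exit, not only rate-limit errors):

```shell
#!/bin/bash
# with_backoff: run a command, retrying failures with growing waits.
with_backoff() {
  local max_attempts=$1; shift
  local delay=1 attempt=1
  until "$@"; do
    (( attempt >= max_attempts )) && return 1   # give up
    sleep "$delay"
    delay=$(( delay * 2 ))     # 1s, 2s, 4s, ...
    attempt=$(( attempt + 1 ))
  done
}

# Example: wrap the per-file gemini call used above
# with_backoff 5 gemini -p "Review src/api/auth.ts for bugs"
```

Exporting the function (export -f with_backoff) lets GNU parallel or xargs invoke it inside their bash -c jobs.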