AS-301d · Module 3

Building a Red Team Program

3 min read

Good news, everyone! A red team is not a penetration test. A penetration test checks whether known vulnerabilities exist. A red team simulates a motivated adversary who will adapt, improvise, and persist until they find a path through your defenses. For AI systems, red teaming means systematically attempting every injection technique — direct, indirect, encoded, multi-turn, tool-exploiting — against your production defenses, documenting what works, and using the findings to strengthen every layer.

Do This

  • Run red team exercises quarterly against production defenses — not staging, not a sanitized test environment
  • Vary the red team composition — internal testers develop blind spots, external testers bring fresh perspectives
  • Document every successful bypass as a regression test case that runs in CI/CD

Avoid This

  • Test only the injection patterns you already know about — the value of red teaming is discovering what you missed
  • Stop after the first successful bypass — find all the paths, not just the first one
  • Run red team exercises without remediation follow-up — findings without fixes are findings wasted