DR-301b · Module 3

Adversarial & Red Team Prompting

3 min read

Adversarial prompting turns the model against its own output. After producing an analysis, you prompt the model to attack it: "Now assume this analysis is wrong. What evidence would disprove it? What assumptions does it rely on? Where is the reasoning weakest?" This is not a creativity exercise — it is a systematic quality gate that surfaces blind spots the initial analysis cannot see because it was produced from a single analytical frame.

## Red Team Analysis Pattern

STEP 1: Produce the analysis normally.

STEP 2: Attack your own conclusions.
"You just produced this analysis. Now red-team it:

1. What are the 3 strongest counterarguments?
2. What evidence would DISPROVE your top conclusion?
3. Which data points are you most uncertain about?
4. What would a domain expert criticize about this
   methodology?
5. If this analysis is wrong, what is the most likely
   alternative explanation?"

STEP 3: Revise.
"Given your red-team findings, revise the original
analysis. Incorporate the strongest counterarguments.
Adjust confidence levels where warranted. Add caveats
where the red team identified genuine uncertainty."
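
The three steps above can be wired into a single loop. This is a minimal sketch, assuming a hypothetical `call_model(prompt) -> str` client; the function name and signature are placeholders for whatever LLM API you actually use. The prompt text is taken verbatim from the pattern.

```python
# Sketch of the analyze -> attack -> revise loop. `call_model` is a
# hypothetical stand-in for your LLM client; inject your real API call.

RED_TEAM_PROMPT = """You just produced this analysis. Now red-team it:

1. What are the 3 strongest counterarguments?
2. What evidence would DISPROVE your top conclusion?
3. Which data points are you most uncertain about?
4. What would a domain expert criticize about this methodology?
5. If this analysis is wrong, what is the most likely alternative explanation?

ANALYSIS:
{analysis}"""

REVISE_PROMPT = """Given your red-team findings, revise the original analysis.
Incorporate the strongest counterarguments. Adjust confidence levels where
warranted. Add caveats where the red team identified genuine uncertainty.

ORIGINAL ANALYSIS:
{analysis}

RED-TEAM FINDINGS:
{critique}"""


def red_team_pipeline(task_prompt, call_model):
    """Run the three-step pattern and return all intermediate artifacts.

    STEP 1: produce the analysis normally.
    STEP 2: feed the analysis back with the red-team prompt.
    STEP 3: revise the analysis using the red-team findings.
    """
    analysis = call_model(task_prompt)
    critique = call_model(RED_TEAM_PROMPT.format(analysis=analysis))
    revised = call_model(
        REVISE_PROMPT.format(analysis=analysis, critique=critique)
    )
    # Keep all three artifacts: the critique is often worth reading on its own.
    return {"analysis": analysis, "critique": critique, "revised": revised}
```

Returning the critique alongside the revision matters in practice: it lets a human reviewer check whether the revision actually incorporated the red-team findings or merely restated the original conclusions.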