PM-201b · Module 3

When NOT to Use Chain-of-Thought

3 min read

Chain-of-thought (CoT) prompting is not a universal improvement. It adds cost in tokens, latency, and output length, and that cost is not always justified. On tasks where the reasoning path is irrelevant, where accuracy is already high, or where the intermediate steps are noise rather than value, CoT makes things worse, not better. The discipline is knowing when to apply it and when to leave it out.

  1. Classification tasks: "Is this email spam or not spam?" A classification task with a defined label set does not benefit from step-by-step reasoning. The model classifies in a single pass. Asking it to think step by step adds tokens, may introduce overthinking, and does not improve accuracy on tasks the model already classifies reliably.
  2. Simple lookup and extraction: "What is the invoice total?" "Extract all email addresses from this text." These are single-pass operations. The model reads the input and returns the answer. CoT adds no reasoning benefit and consumes tokens that could be used for content.
  3. Format transformations: "Convert this CSV to JSON." "Reformat this date from MM/DD/YYYY to YYYY-MM-DD." Format transformations do not require reasoning, only pattern application. CoT is overhead, not value-add.
  4. High-volume, low-stakes tasks: When running prompts at scale, across hundreds or thousands of calls, the token cost of CoT compounds. If accuracy is already acceptable without CoT, the cost savings from removing it are significant. Reserve CoT for tasks where the accuracy improvement materially changes the outcome.
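The compounding in point 4 is easy to quantify. A minimal sketch, with the caveat that the per-call token counts and the price per 1K output tokens below are illustrative assumptions, not measurements of any real model:

```python
# Illustrative only: the price and token counts are assumed values
# chosen to show how CoT overhead compounds at scale.
PRICE_PER_1K_OUTPUT_TOKENS = 0.015  # assumed price, USD

def batch_cost(calls: int, output_tokens_per_call: int) -> float:
    """Total output-token cost, in USD, for a batch of identical calls."""
    return calls * output_tokens_per_call / 1000 * PRICE_PER_1K_OUTPUT_TOKENS

# A terse classification label vs. the same label preceded by
# step-by-step reasoning (assumed ~300 extra output tokens per call).
direct = batch_cost(10_000, 20)
with_cot = batch_cost(10_000, 320)

print(f"direct:   ${direct:.2f}")    # $3.00
print(f"with CoT: ${with_cot:.2f}")  # $48.00
print(f"overhead: ${with_cot - direct:.2f}")
```

Swap in your own measured token counts and pricing; the point is that a per-call overhead that looks trivial in testing becomes the dominant line item at batch scale.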

Do This

  • Use CoT for multi-step inference, complex reasoning, planning, and trade-off analysis
  • Measure accuracy with and without CoT on representative samples before deciding
  • Apply CoT selectively to the steps in a pipeline where reasoning matters
  • Remove CoT when the task is a classification, lookup, extraction, or format transformation
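The second bullet, measuring before deciding, can be as simple as scoring both prompt variants on the same labeled sample. A sketch under stated assumptions: the two callables stand in for your model invocations (here they are trivial rule stubs so the example runs standalone; in practice each would wrap an API call with and without the CoT instruction):

```python
from typing import Callable

def accuracy(predictions: list[str], gold: list[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    assert len(predictions) == len(gold) and gold
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

def compare_variants(
    run_direct: Callable[[str], str],   # prompt without CoT
    run_cot: Callable[[str], str],      # prompt with CoT
    samples: list[tuple[str, str]],     # (input, gold label) pairs
) -> dict[str, float]:
    """Score both prompt variants on the same labeled sample."""
    inputs = [x for x, _ in samples]
    gold = [y for _, y in samples]
    return {
        "direct": accuracy([run_direct(x) for x in inputs], gold),
        "cot": accuracy([run_cot(x) for x in inputs], gold),
    }

# Stub demo: both "models" are toy rules standing in for real calls.
samples = [("WIN A PRIZE!!!", "spam"), ("Meeting at 3pm", "ham")]
demo = compare_variants(
    run_direct=lambda text: "spam" if "!!!" in text else "ham",
    run_cot=lambda text: "spam" if "PRIZE" in text else "ham",
    samples=samples,
)
print(demo)
```

If the CoT variant does not beat the direct variant by a margin that matters for your use case, the extra tokens are pure cost.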

Avoid This

  • Do not add "think step by step" to every prompt as a habit
  • Do not assume CoT always improves accuracy — test it
  • Do not pay for CoT reasoning tokens on tasks where accuracy is already adequate without it
  • Do not use CoT as a substitute for better grounding or constraint engineering on factual tasks