PM-301d · Module 3

Fixing Reasoning at the Source

Reasoning failures can be addressed at one of two layers: the reasoning prompt (where the failure originates) or the output post-processor (where the failure manifests). It is tempting to fix them in the post-processor, because that is easier, faster, and does not require re-testing the full prompt. That is the wrong intervention when the failure is a reasoning error rather than a formatting error.

Do This

  • Fix reasoning failures in the prompt: add constraints, add intermediate steps, inject premises
  • Fix formatting errors in the post-processor: normalize fields, handle missing keys, cast types
  • Handle task-specific, highly varied semantic errors in the post-processor only when the prompt cannot be made more specific without becoming brittle
  • Document which failure modes are handled at which layer — reasoning in prompt, format in parser
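
As an illustration of the format-only fixes listed above, here is a minimal sketch of a post-processor in Python. The field names, aliases, defaults, and date formats are hypothetical and chosen only for this example; the point is that every step normalizes, fills, or casts, and none of them second-guesses the model's reasoning.

```python
from datetime import datetime

# Hypothetical alias table and defaults; a real schema would define its own.
FIELD_ALIASES = {"Total": "total", "total_amount": "total", "Date": "date", "due": "date"}
DEFAULTS = {"total": None, "date": None, "currency": "USD"}


def normalize_date(value):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"):
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def postprocess(raw):
    """Apply format-only fixes: canonical keys, missing-key defaults, type casts."""
    out = dict(DEFAULTS)  # missing keys fall back to defaults
    for key, value in raw.items():
        # Canonicalize field names; unknown keys are lowercased and kept.
        out[FIELD_ALIASES.get(key, key.strip().lower())] = value
    if isinstance(out["total"], str):
        # Cast "1,250.50" style strings to a number.
        out["total"] = float(out["total"].replace(",", "").strip())
    if isinstance(out["date"], str):
        out["date"] = normalize_date(out["date"])
    if isinstance(out["currency"], str):
        out["currency"] = out["currency"].strip().upper()
    return out
```

Nothing in this sketch inspects whether the total or the date is the right answer. If it is not, that is a reasoning failure and belongs in the prompt.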

Avoid This

  • Route reasoning failures to the post-processor because fixing the prompt is harder
  • Add heuristics to the post-processor to "fix" outputs that have wrong reasoning traces
  • Treat a post-processor hack as a permanent fix for a reasoning failure
  • Fix both reasoning and formatting in the post-processor — this creates untraceable dependencies

  1. Decision: Prompt vs. Post-Processor
     If the failure is in the reasoning trace (wrong premise, missed constraint, invalid inference), fix it in the prompt. The post-processor cannot correct reasoning; it can only reformat outputs. If you are fixing a reasoning error in the post-processor, you are patching a broken process rather than fixing it.
  2. When Post-Processor Fixes Are Appropriate
     Post-processor fixes are appropriate for format normalization (date formats, case, whitespace), type casting (strings to numbers), and field name canonicalization. These are format errors, not reasoning errors. Post-processors are the right tool for format errors and the wrong tool for reasoning errors.
  3. Test After Fixing the Prompt
     After addressing a reasoning failure in the prompt, run the full test suite, not just the case that failed. A prompt change that fixes one reasoning failure can introduce regressions on other cases. Regression testing after every prompt change is not optional.
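
The regression step can be sketched as a small Python harness. This is one possible shape, not a prescribed tool: `call_model` is a hypothetical callable standing in for however your system invokes the model, and the case format (`name`, `input`, `expected`) is an assumption made for the example.

```python
def run_regression(prompt, cases, call_model):
    """Run every case against the prompt and return the names of failing cases."""
    failures = []
    for case in cases:
        # call_model is a hypothetical hook: (prompt, input) -> output
        output = call_model(prompt, case["input"])
        if output != case["expected"]:
            failures.append(case["name"])
    return failures


def compare_prompts(old_prompt, new_prompt, cases, call_model):
    """Compare failing cases before and after a prompt change."""
    before = set(run_regression(old_prompt, cases, call_model))
    after = set(run_regression(new_prompt, cases, call_model))
    regressions = sorted(after - before)  # passed before, fail now
    fixed = sorted(before - after)        # failed before, pass now
    return {"fixed": fixed, "regressions": regressions, "ok": not regressions}
```

Running the whole suite against both prompt versions makes regressions visible: a change that fixes the case you were looking at but breaks another one comes back with `ok` set to `False`. Exact-match comparison is a simplification; free-form outputs would need a task-specific comparison in its place.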