PM-301d · Module 3
Fixing Reasoning at the Source
4 min read
Reasoning failures can be addressed at two layers: the reasoning prompt, where the failure originates, or the output post-processor, where the failure manifests. The temptation is to fix them in the post-processor, because that is easier and faster and does not require re-testing the full prompt. That is the wrong intervention when the failure is a reasoning error rather than a formatting error.
Do This
- Fix reasoning failures in the prompt: add constraints, add intermediate steps, inject premises
- Fix formatting errors in the post-processor: normalize fields, handle missing keys, cast types
- Handle task-specific, varied semantic errors in the post-processor only when the prompt cannot be made more specific without becoming brittle
- Document which failure modes are handled at which layer: reasoning in the prompt, format in the post-processor
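The split in the list above can be sketched as a post-processor that owns only format concerns, alongside a written record of which layer handles which failure mode. This is a minimal sketch, assuming the model output has already been parsed into a dict. The field names (`label`, `score`), the tables `LAYER_OWNERSHIP`, `FIELD_ALIASES`, and `DEFAULTS`, and the function `postprocess` are all hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical ownership record: which layer handles which failure mode.
LAYER_OWNERSHIP = {
    "wrong_premise": "prompt",
    "missed_constraint": "prompt",
    "invalid_inference": "prompt",
    "missing_key": "post_processor",
    "wrong_type": "post_processor",
    "field_name_variant": "post_processor",
}

# Hypothetical field-name variants mapped to canonical names.
FIELD_ALIASES = {"Label": "label", "confidence_score": "score"}

# Hypothetical defaults for keys the model may omit.
DEFAULTS = {"label": "unknown", "score": 0.0}


def postprocess(raw: dict[str, Any]) -> dict[str, Any]:
    """Apply format-only fixes; never change what the model concluded."""
    # Canonicalize field names.
    out = {FIELD_ALIASES.get(key, key): value for key, value in raw.items()}
    # Fill in missing keys with explicit defaults.
    for key, default in DEFAULTS.items():
        out.setdefault(key, default)
    # Cast types: the score may arrive as a string such as " 0.82 ".
    out["score"] = float(out["score"])
    # Normalize case and whitespace on the label.
    out["label"] = str(out["label"]).strip().lower()
    return out
```

Nothing in `postprocess` inspects or overrides the answer itself. If a heuristic that rewrites `label` based on its content starts to look necessary, that is a sign the failure belongs to the prompt layer.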
Avoid This
- Route reasoning failures to the post-processor because fixing the prompt is harder
- Add heuristics to the post-processor to "fix" outputs that have wrong reasoning traces
- Treat a post-processor hack as a permanent fix for a reasoning failure
- Fix both reasoning and formatting in the post-processor; mixing the two creates untraceable dependencies
Decision: Prompt vs. Post-Processor
If the failure is in the reasoning trace (wrong premise, missed constraint, invalid inference), fix it in the prompt. The post-processor cannot correct reasoning; it can only reformat outputs. Fixing a reasoning error in the post-processor patches the symptom and leaves the broken reasoning in place.
When Post-Processor Fixes Are Appropriate
Format normalization (date formats, case, whitespace), type casting (strings to numbers), and field name canonicalization address format errors, not reasoning errors. Post-processors are the right tool for format errors and the wrong tool for reasoning errors.
Test After Fixing the Prompt
After addressing a reasoning failure in the prompt, run the full test suite, not just the case that failed. A prompt change that fixes one reasoning failure can introduce regressions on other cases, so regression testing after every prompt change is not optional.
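The regression check described above can be sketched as a comparison of full-suite results before and after a prompt change. This is a minimal sketch under stated assumptions: `run_suite`, `find_regressions`, the `CASES` data, and the two stub functions are hypothetical. The stubs stand in for calls to a model with the old and new prompt, and exact string match stands in for whatever grading the real suite uses.

```python
from typing import Callable

# Hypothetical test case shape: (case_id, input_text, expected_answer).
Case = tuple[str, str, str]

CASES: list[Case] = [
    ("c1", "refund request after 45 days", "deny"),
    ("c2", "refund request after 10 days", "approve"),
]


def run_suite(run_prompt: Callable[[str], str], cases: list[Case]) -> dict[str, bool]:
    """Run every case, not just the one that failed, and record pass/fail by id."""
    return {case_id: run_prompt(text) == expected for case_id, text, expected in cases}


def find_regressions(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return ids of cases that passed before the prompt change and fail after it."""
    return sorted(
        case_id
        for case_id, passed in baseline.items()
        if passed and not current.get(case_id, False)
    )


def stub_old_prompt(text: str) -> str:
    # Stand-in for the model with the original prompt: misses the 30-day limit.
    return "approve"


def stub_new_prompt(text: str) -> str:
    # Stand-in for the model after an overcorrected prompt fix.
    return "deny"


if __name__ == "__main__":
    baseline = run_suite(stub_old_prompt, CASES)
    current = run_suite(stub_new_prompt, CASES)
    print(find_regressions(baseline, current))  # prints ['c2']
```

In this toy run the prompt change fixes `c1` but breaks `c2`, which re-running only the originally failing case would not reveal.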