PM-301c · Module 2
Handling Schema Violations
Schema violations happen in production. The question is not whether to handle them but how. Unhandled schema violations propagate through your pipeline as malformed data, causing failures at the point of consumption, where they are much harder to diagnose, rather than at the point of generation.
- Detection: Parse and validate immediately after generation, and do not pass the raw string to downstream systems. Try JSON.parse() (or an equivalent), then validate the parsed result against your schema. On failure, log the violation with the full prompt, the full response, and the validation error. All three are required for root cause analysis.
- Classify the Violation: Distinguish a syntax error (invalid JSON) from a schema violation (valid JSON, wrong structure). A syntax error suggests the model did not comply with JSON mode constraints, so check the API configuration. A schema violation suggests the schema specification in the prompt needs strengthening.
- Retry Logic: For transient violations, retry the same request with the same prompt. Models can produce valid output on a second attempt even when the first fails. Limit retries to 2, for three total attempts. If the third attempt fails, treat it as a systemic failure, not a transient one.
- Corrective Retry: For systemic violations, retry with an augmented prompt that includes the failed output and an explicit correction instruction: "Your previous response was: [bad output]. It failed schema validation because [specific reason]. Return the corrected JSON only, with no other text." This surfaces the error to the model and often produces a compliant response.
- Prompt Hardening: If violations occur on more than 2% of requests, the prompt needs strengthening. Add more explicit constraints for the fields that fail validation. Add the correction language from the retry pattern to the base prompt. Review the prompt's examples for format inconsistencies.
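The detection, classification, and same-prompt retry steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: `json.loads` stands in for `JSON.parse()`, the required-field check is a simplified stand-in for real schema validation, and the names `REQUIRED_FIELDS`, `check_output`, `generate_validated`, `needs_hardening`, and the `call_model` callable are all hypothetical placeholders for your own schema and model client.

```python
import json
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger("schema_violations")

# Assumed schema for illustration: only a list of required top-level fields.
REQUIRED_FIELDS = ("name", "score", "risk_level", "recommendations")


def check_output(raw: str) -> Tuple[Optional[dict], Optional[str], Optional[str]]:
    """Return (data, kind, error). kind is None, "syntax", or "schema"."""
    try:
        data = json.loads(raw)  # Python equivalent of JSON.parse()
    except json.JSONDecodeError as exc:
        # Invalid JSON: points at JSON mode / API configuration.
        return None, "syntax", f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "schema", "top-level value is not a JSON object"
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        # Valid JSON, wrong structure: points at the prompt's schema spec.
        return None, "schema", "missing required field(s): " + ", ".join(missing)
    return data, None, None


def generate_validated(
    call_model: Callable[[str], str], prompt: str, max_retries: int = 2
) -> dict:
    """Send the same prompt up to max_retries + 1 times (three by default)."""
    kind, error = None, None
    for attempt in range(1, max_retries + 2):
        raw = call_model(prompt)
        data, kind, error = check_output(raw)
        if data is not None:
            return data
        # Log prompt, response, and validation error together.
        logger.warning(
            "attempt=%d kind=%s error=%s prompt=%r response=%r",
            attempt, kind, error, prompt, raw,
        )
    # Every attempt failed: treat as systemic and let the caller escalate.
    raise ValueError(f"systemic failure after {max_retries + 1} attempts: {kind}: {error}")


def needs_hardening(violations: int, total: int, threshold: float = 0.02) -> bool:
    """True when the violation rate exceeds the 2% threshold."""
    return total > 0 and violations / total > threshold
```

A caller would catch the `ValueError` and switch to the corrective retry, while feeding violation counts from the log into `needs_hardening` to decide when the base prompt needs work.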
# Corrective retry prompt
Your previous response failed JSON schema validation.
Your response was:
"""
[PASTE FAILED RESPONSE HERE]
"""
Validation error: missing required field "risk_level"
Return ONLY the corrected JSON object. Do not include any explanation,
any prose, or any text outside the JSON object. The JSON must include
all required fields: name, score, risk_level, recommendations.
Corrected JSON:
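
The template above can be filled in programmatically rather than pasted by hand. The following Python sketch assumes a hypothetical helper, `build_corrective_prompt`, that takes the failed response, the validation error message, and the list of required fields; the wording mirrors the template, and the function name and parameters are illustrative only.

```python
from typing import Sequence


def build_corrective_prompt(
    failed_response: str, validation_error: str, required_fields: Sequence[str]
) -> str:
    """Fill the corrective retry template with the failed output and error."""
    fields = ", ".join(required_fields)
    return (
        "Your previous response failed JSON schema validation.\n"
        "Your response was:\n"
        # Triple quotes delimit the failed output so the model can see its bounds.
        f'"""\n{failed_response}\n"""\n'
        f"Validation error: {validation_error}\n"
        "Return ONLY the corrected JSON object. Do not include any explanation,\n"
        "any prose, or any text outside the JSON object. The JSON must include\n"
        f"all required fields: {fields}.\n"
        "Corrected JSON:"
    )
```

Passing the exact validation error, rather than a generic message, gives the model the specific reason the previous attempt failed.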