KM-201a · Module 2
The Knowledge Schema: Structuring Articles, Processes, Decisions, and References
A knowledge schema defines the structure of individual knowledge artifacts — what fields they contain, what format each field follows, and what metadata is required for publication. Without a schema, every contributor structures their knowledge differently, resulting in a knowledge base that can be browsed but not systematically queried, audited, or maintained.
Different knowledge types require different schemas. A policy document has a different structure than a technical runbook, which has a different structure than a decision record, which has a different structure than a reference guide. Applying a single universal schema to all knowledge types produces documents that are either over-specified (policy documents forced into a field structure designed for procedures) or under-specified (runbooks missing the step-by-step structure that makes them executable).
- Policy Schema. Required fields: title (content-type-first naming convention), owner, effective date, review date, scope (who is subject to this policy), policy statement, rationale, exceptions, related policies. The rationale field is the most commonly omitted and the most valuable for long-term maintenance: policies without a rationale get overridden or ignored once nobody understands why they exist.
- Procedure / Runbook Schema. Required fields: title, owner, last verified date, prerequisites (what the performer needs to know or have access to before starting), steps (numbered items with enough detail that someone new to the task can carry it out), verification steps (how to confirm the procedure was performed correctly), escalation path (what to do if a step fails). The last verified date distinguishes a current runbook from one that was accurate when written and has not been touched since.
- Decision Record Schema. Required fields: title (a decision statement, not a generic label), date, decision-makers, context (what situation prompted this decision), options considered (not just the one chosen), decision (what was decided and why), consequences (expected outcomes and known tradeoffs), status (proposed, accepted, deprecated, or superseded). Decision records are the most underused knowledge type and the one with the highest long-term value: they answer the 'why does this work this way' question that otherwise requires finding the person who was in the room.
- Reference Schema. Required fields: title, owner, last updated date, audience (who this reference serves), content (the reference material), related procedures (where this reference is applied). References are the supporting layer: they provide the context and data that procedures and policies draw from. Examples include a pricing reference document, a technical specification, and a regulatory requirement summary. The audience field drives personalization in retrieval, so the system can surface a reference when a member of its audience is querying.
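The four schemas above can be written down as data so that tooling can check documents against them. The following is a minimal Python sketch rather than a prescribed implementation. The snake_case field names, the type keys (`policy`, `procedure`, `decision`, `reference`), the helpers `missing_fields` and `has_valid_status`, and the sample record are all assumptions introduced for illustration; only the field lists and status values come from the schemas themselves.

```python
# Required fields per knowledge type, taken from the four schemas above.
# The snake_case names and the type keys are illustrative assumptions.
REQUIRED_FIELDS = {
    "policy": ("title", "owner", "effective_date", "review_date", "scope",
               "policy_statement", "rationale", "exceptions", "related_policies"),
    "procedure": ("title", "owner", "last_verified_date", "prerequisites",
                  "steps", "verification_steps", "escalation_path"),
    "decision": ("title", "date", "decision_makers", "context",
                 "options_considered", "decision", "consequences", "status"),
    "reference": ("title", "owner", "last_updated_date", "audience",
                  "content", "related_procedures"),
}

# Allowed values for the decision record status field.
DECISION_STATUSES = {"proposed", "accepted", "deprecated", "superseded"}


def missing_fields(doc_type, doc):
    """Return the required fields that are absent or blank in doc."""
    def is_blank(value):
        return value is None or (isinstance(value, str) and not value.strip())
    return [name for name in REQUIRED_FIELDS[doc_type] if is_blank(doc.get(name))]


def has_valid_status(doc):
    """Check a decision record's status against the allowed values."""
    return doc.get("status") in DECISION_STATUSES


# Hypothetical decision record that omits two required fields.
draft = {
    "title": "Use a single ticketing system for all support tiers",
    "date": "2024-03-01",
    "decision_makers": ["Support lead", "IT lead"],
    "context": "Two teams tracked the same incidents in separate tools.",
    "decision": "Consolidate on one system to remove duplicate tracking.",
    "consequences": "One-time migration effort; simpler reporting afterwards.",
}
print(missing_fields("decision", draft))
# prints: ['options_considered', 'status']
```

Keeping one field list per type, instead of one universal list, mirrors the point that each knowledge type needs its own structure.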
Schema enforcement is the second major governance challenge after naming conventions. The temptation is to make fields optional to reduce contribution friction, but optional fields tend not to be filled in. As a rule of thumb, expect that within six months of launch, the more optional fields a schema has, the smaller the share of documents with complete metadata. Make the minimum viable set of fields required at publication. Everything else can be recommended but not enforced.
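One way to apply the split between required and recommended fields is a check at publication time that blocks on missing required fields and only warns on missing recommended ones. The sketch below assumes documents are plain dictionaries. `PublishCheck`, `check_publishable`, `is_blank`, and the recommended fields `tags` and `related_references` are hypothetical names chosen for illustration, not part of any particular knowledge base product.

```python
from dataclasses import dataclass, field


@dataclass
class PublishCheck:
    """Outcome of a publication check (hypothetical structure)."""
    errors: list = field(default_factory=list)    # missing required fields
    warnings: list = field(default_factory=list)  # missing recommended fields

    @property
    def publishable(self):
        # Only missing required fields block publication.
        return not self.errors


def is_blank(value):
    """Treat None, blank strings, and empty collections as not filled in."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def check_publishable(doc, required, recommended=()):
    """Collect missing required fields as errors, recommended ones as warnings."""
    result = PublishCheck()
    result.errors = [name for name in required if is_blank(doc.get(name))]
    result.warnings = [name for name in recommended if is_blank(doc.get(name))]
    return result


# Required set for a runbook; the recommended set is an assumption.
RUNBOOK_REQUIRED = ("title", "owner", "last_verified_date", "prerequisites",
                    "steps", "verification_steps", "escalation_path")
RUNBOOK_RECOMMENDED = ("tags", "related_references")

draft = {
    "title": "Restore a database from the nightly backup",
    "owner": "platform-team",
    "last_verified_date": "2024-05-10",
    "prerequisites": ["Read access to the backup store"],
    "steps": ["Stop writes", "Restore the snapshot", "Resume writes"],
    "verification_steps": [],
    "escalation_path": "Page the on-call database engineer",
}
check = check_publishable(draft, RUNBOOK_REQUIRED, RUNBOOK_RECOMMENDED)
print(check.publishable, check.errors, check.warnings)
# prints: False ['verification_steps'] ['tags', 'related_references']
```

Returning warnings separately lets an authoring tool nudge contributors toward the recommended fields without adding friction to publication.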
For AI retrieval specifically, the metadata schema is critical infrastructure. A RAG system that can filter by document type, owner, date, and audience surfaces far more precise results than one that can only search the full text. Before implementing AI retrieval on a knowledge base, audit whether the existing schema provides enough metadata to support filtered queries. If not, the schema needs to be updated and the existing documents need to be enriched — which is a significant migration project but a necessary prerequisite for high-quality AI retrieval.
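The audit and the filtered query described above can both be sketched in a few lines of Python. This is an illustration under stated assumptions, not a real RAG pipeline: documents are plain dictionaries, the field names (`doc_type`, `owner`, `audience`, `content`) and the functions `metadata_coverage` and `filtered_search` are hypothetical, and a naive substring match stands in for whatever text or vector search the actual system uses.

```python
def metadata_coverage(docs, fields):
    """Return, per field, the fraction of documents where it is filled in."""
    total = len(docs)
    if total == 0:
        return {name: 0.0 for name in fields}
    return {
        name: sum(1 for d in docs if d.get(name) not in (None, "")) / total
        for name in fields
    }


def filtered_search(docs, query, **filters):
    """Narrow by exact metadata match first, then search the remaining text.

    The substring match is a placeholder for a real retriever.
    """
    candidates = [
        d for d in docs
        if all(d.get(key) == value for key, value in filters.items())
    ]
    needle = query.lower()
    return [d for d in candidates if needle in d.get("content", "").lower()]


# Hypothetical knowledge base; the third document has no audience set.
docs = [
    {"title": "Rotate API keys", "doc_type": "procedure", "owner": "platform",
     "audience": "engineering", "content": "Steps to rotate API keys."},
    {"title": "API key policy", "doc_type": "policy", "owner": "security",
     "audience": "engineering",
     "content": "API keys must be rotated every 90 days."},
    {"title": "Pricing tiers", "doc_type": "reference", "owner": "sales-ops",
     "content": "Current pricing tiers."},
]

# Audit: which filter fields are populated widely enough to rely on?
print(metadata_coverage(docs, ["doc_type", "owner", "audience"]))

# Filtered query: only policies that mention API keys.
for hit in filtered_search(docs, "api keys", doc_type="policy"):
    print(hit["title"])
```

A field with low coverage is a poor filter, because every document missing that field silently drops out of filtered results. That is why the enrichment work has to come before filtered retrieval is switched on.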