KM-201a · Module 1

Choosing a Taxonomy: Flat, Hierarchical, or Faceted

The taxonomy decision is the most consequential architectural choice in knowledge system design and the one most commonly made by default. Teams pick a structure because it mirrors their org chart, or because it matches the way a previous system was organized, or because the first person to set up the wiki made folders that felt logical at the time. None of those are taxonomy design processes. They are accidents that become permanent.

A taxonomy has one job: make any piece of knowledge findable by someone who does not know where it is stored. That means the classification system must be predictable. A new employee — or an AI retrieval system — must be able to look at a piece of knowledge and correctly predict where it belongs without consulting the person who built the taxonomy. Unpredictability is the failure mode. When different people would classify the same document differently, the taxonomy has failed regardless of how logical it felt to its designer.
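One rough way to check predictability is to ask several people to file the same documents independently and see how often they agree. The sketch below assumes a small card-sort style exercise; the document names, category names, and the helper functions `agreement_rate` and `contested_documents` are hypothetical and only illustrate the idea.

```python
from collections import Counter

def agreement_rate(labels_by_doc):
    """Fraction of documents on which every participant chose the same category."""
    if not labels_by_doc:
        raise ValueError("need at least one document")
    unanimous = sum(1 for labels in labels_by_doc.values() if len(set(labels)) == 1)
    return unanimous / len(labels_by_doc)

def contested_documents(labels_by_doc):
    """Documents where participants disagreed, with vote counts per category."""
    return {
        doc: dict(Counter(labels))
        for doc, labels in labels_by_doc.items()
        if len(set(labels)) > 1
    }

# Hypothetical results: three people each file the same four documents.
results = {
    "expense-policy": ["Finance", "Finance", "Finance"],
    "pricing-model": ["Sales", "Finance", "Sales"],
    "api-style-guide": ["Engineering", "Engineering", "Engineering"],
    "contract-template": ["Legal", "Sales", "Legal"],
}

rate = agreement_rate(results)
contested = contested_documents(results)
print(rate)       # 0.5
print(contested)  # the two documents people filed differently
```

A low agreement rate does not say which structure to choose, but the contested documents show where the current categories are ambiguous.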

  1. Flat Taxonomy: A single-level classification with no hierarchy. Every knowledge artifact has one or more tags from a controlled vocabulary. Advantages: simple to maintain, easy to query, no ambiguity about which level of the hierarchy to use. Disadvantages: poor for large corpora, where hundreds of tags become unwieldy, and relationships between concepts are not represented. Best for: small teams, narrow domains, or as the tag layer on top of a hierarchical structure.
  2. Hierarchical Taxonomy: A tree structure where categories contain subcategories. The most common structure for enterprise knowledge bases because it mirrors the mental model most people have for organizing information. Advantages: intuitive navigation, clear parent-child relationships, scales well for large corpora. Disadvantages: documents that belong in multiple branches create tension, because you must either duplicate them or make an arbitrary choice. The organizational politics of hierarchy are real: whose team's folder does this document live in? Best for: domain-specific knowledge with clear categorical structure.
  3. Faceted Taxonomy: Multiple independent classification axes applied simultaneously. A document can be classified by content type (policy, procedure, reference), by domain (sales, engineering, operations), by audience (all-staff, managers, technical), and by status (current, under-review, deprecated), all at once. Advantages: flexible, powerful filtering, no forced choice between branches. Disadvantages: requires disciplined application of all facets; inconsistently tagged documents are hard to find. Best for: large, cross-functional knowledge bases where documents serve multiple audiences and contexts.
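As a minimal sketch, the three structures can be pictured as three different ways of indexing the same document. The document name `vpn-setup`, the tag vocabulary, the tree path, and the helper names below are illustrative assumptions; the facet axes and their example values come from the list above.

```python
from dataclasses import dataclass

# Flat: a document carries one or more tags from a single controlled vocabulary.
CONTROLLED_VOCABULARY = {"onboarding", "pricing", "security"}  # hypothetical tags
flat_index = {"vpn-setup": {"security", "onboarding"}}

def unknown_tags(tags):
    """Return tags that are not part of the controlled vocabulary."""
    return set(tags) - CONTROLLED_VOCABULARY

# Hierarchical: a document lives at exactly one path in the tree.
hierarchical_index = {"vpn-setup": ("Engineering", "Standards", "Network")}

def is_under(path, ancestor):
    """True if `path` sits at or below `ancestor` in the tree."""
    return path[:len(ancestor)] == tuple(ancestor)

# Faceted: every document gets a value on each independent axis.
@dataclass(frozen=True)
class Facets:
    content_type: str  # policy, procedure, reference
    domain: str        # sales, engineering, operations
    audience: str      # all-staff, managers, technical
    status: str        # current, under-review, deprecated

faceted_index = {"vpn-setup": Facets("procedure", "engineering", "all-staff", "current")}

print(unknown_tags({"security", "vpn"}))                             # {'vpn'}
print(is_under(hierarchical_index["vpn-setup"], ("Engineering",)))  # True
print(faceted_index["vpn-setup"].status)                             # current
```

The hierarchical index stores a single path per document, which is exactly where the multiple-branch tension comes from. The faceted record avoids that choice but only works if every field is filled in consistently.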

Most mature enterprise knowledge systems use a hybrid: a hierarchical structure for primary navigation, combined with faceted tags for filtering and cross-cutting retrieval. The hierarchy gives users a browsable structure. The facets give them filters. A user who knows they need something in the Engineering domain but cannot remember whether it is under Architecture, Processes, or Standards can filter by facet to find it across all three branches.
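The Engineering example can be sketched as a query that combines a branch of the hierarchy with facet filters. This is an illustration under assumptions, not a reference implementation: the `Document` class, the `filter_docs` helper, and the sample documents are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    path: tuple                                 # position in the hierarchy, used for browsing
    facets: dict = field(default_factory=dict)  # cross-cutting attributes, used for filtering

def filter_docs(docs, under=(), **facets):
    """Return docs at or below `under` whose facets match every keyword given."""
    return [
        d for d in docs
        if d.path[:len(under)] == tuple(under)
        and all(d.facets.get(key) == value for key, value in facets.items())
    ]

# Hypothetical documents spread across three Engineering branches and one Sales branch.
docs = [
    Document("Service boundaries", ("Engineering", "Architecture"),
             {"content_type": "reference", "status": "current"}),
    Document("Release checklist", ("Engineering", "Processes"),
             {"content_type": "procedure", "status": "current"}),
    Document("API naming rules", ("Engineering", "Standards"),
             {"content_type": "reference", "status": "current"}),
    Document("Discount approval", ("Sales", "Processes"),
             {"content_type": "procedure", "status": "current"}),
]

# The user knows the domain and the content type, but not the sub-branch.
matches = filter_docs(docs, under=("Engineering",), content_type="reference")
print([d.title for d in matches])  # ['Service boundaries', 'API naming rules']
```

Leaving `under` empty searches every branch, so the same filter also supports purely faceted retrieval.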

The choice between structures should be driven by user research, not by the knowledge architect's aesthetic preference. Ask three questions: How do users most commonly arrive at knowledge — do they browse or do they search? How cross-functional is the knowledge — does it serve multiple teams or primarily one? How often do documents genuinely belong to multiple categories? The answers determine the structure.

One common failure mode deserves specific attention: the organizational chart taxonomy. This is a hierarchical structure where the top-level categories map directly to business units — Sales, Marketing, Engineering, HR, Finance. It feels logical because it matches reporting structure. It fails because knowledge does not respect org charts. The sales process involves finance's pricing models, engineering's technical specs, and legal's contract terms. A knowledge base organized by org unit forces users to know which team 'owns' a piece of information before they can find it. That is the opposite of how knowledge retrieval should work.