KM-301a · Module 3

Taxonomy Migration

At some point, the taxonomy must change. A top-level category needs to split. Two categories need to merge. A faceted system needs to replace a hierarchical one. The challenge: a live knowledge base cannot simply be taken offline for reclassification. Users are reading and writing during the migration. Search indexes need to remain coherent. The migration must be reversible if something goes wrong.

This is the ATLAS framework for taxonomy migration: map, stage, migrate in batches with validation after each batch, and deprecate rather than delete.

  1. Map: Build the Complete Reclassification Plan
     Before touching a single item, produce a complete mapping from every old category to its new category or categories. For old categories that split across several new ones, document the decision rules: which items go where, and how a contributor can determine the right classification for new content going forward. The map is the specification. Migrating without a complete map produces inconsistent results and edge cases that cannot be resolved mid-migration.
  2. Stage: Run Shadow Classification
     Apply the new taxonomy tags to content while keeping the old tags in place, so that both classification systems are live simultaneously. Search and navigation continue to work under the old taxonomy while the new one is validated. Run search quality tests against both taxonomies in parallel. Users do not see the new structure yet; you are validating it against real content before the cutover.
  3. Migrate in Batches with Rollback Gates
     Migrate content category by category, not all at once. After each batch, run automated validation: the item count matches, no content is orphaned, and canonical test queries still return the expected items. Define rollback criteria before starting: if batch N fails validation, what is the rollback procedure? Rollback must be executable without manual reclassification.
  4. Deprecate, Never Delete
     Old categories are deprecated, not deleted. A deprecated category is hidden from contributors classifying new content but still resolves for existing content and bookmarks. Set a deprecation window (typically 90 days). When the window closes, deprecated categories are archived: still accessible by direct URL or admin query, but removed from navigation. Deletion is a one-way door; deprecation gives you a reversal path.
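
The completeness requirement in step 1 can be checked mechanically before any content moves. The sketch below is one possible shape, not a prescribed format: `MappingRule`, `MapCheck`, and `checkMigrationMap` are hypothetical names, and it assumes the map is kept as a flat list of rules with a free-text `decisionRule` for categories that split.

```typescript
// Hypothetical shape for one row of the reclassification map.
interface MappingRule {
  oldCategory: string;
  newCategories: string[];
  // Expected when an old category splits: how to choose among the targets.
  decisionRule?: string;
}

interface MapCheck {
  unmapped: string[];       // old categories with no rule, or a rule with no targets
  unknownTargets: string[]; // targets that do not exist in the new taxonomy
  missingRules: string[];   // splits that lack a documented decision rule
}

function checkMigrationMap(
  oldCategories: string[],
  newTaxonomy: Set<string>,
  rules: MappingRule[]
): MapCheck {
  const byOld = new Map<string, MappingRule>();
  for (const rule of rules) byOld.set(rule.oldCategory, rule);

  const unmapped: string[] = [];
  const unknownTargets: string[] = [];
  const missingRules: string[] = [];

  for (const old of oldCategories) {
    const rule = byOld.get(old);
    if (!rule || rule.newCategories.length === 0) {
      unmapped.push(old);
      continue;
    }
    for (const target of rule.newCategories) {
      if (!newTaxonomy.has(target)) unknownTargets.push(`${old} -> ${target}`);
    }
    // A split (more than one target) needs a written rule for choosing between them.
    const hasRule = (rule.decisionRule ?? "").trim().length > 0;
    if (rule.newCategories.length > 1 && !hasRule) missingRules.push(old);
  }

  return { unmapped, unknownTargets, missingRules };
}
```

Under these assumptions, the map is ready to drive a migration only when all three arrays come back empty.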
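
The validation gate from step 3 might look like this in TypeScript:

// Assumed shape of the knowledge base client, inferred from the calls below;
// adapt it to your platform's actual API. countItemsInCategories is assumed
// to count distinct items, so an item tagged with two new categories counts once.
interface SearchHit {
  id: string;
}

interface TestQuery {
  text: string;
  expectedTopItemId: string;
}

interface KnowledgeBaseClient {
  countItemsInCategories(categories: string[]): Promise<number>;
  findOrphanItems(oldCategory: string, newCategories: string[]): Promise<SearchHit[]>;
  getTestQueriesForCategory(category: string): Promise<TestQuery[]>;
  search(query: string): Promise<SearchHit[]>;
}

interface MigrationBatch {
  oldCategory: string;
  newCategories: string[];
  itemCount: number;
}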

interface ValidationResult {
  passed: boolean;
  errors: string[];
  warnings: string[];
}

async function validateMigrationBatch(
  batch: MigrationBatch,
  kb: KnowledgeBaseClient
): Promise<ValidationResult> {
  const errors: string[] = [];
  const warnings: string[] = [];

  // 1. Item count integrity: no items lost or duplicated
  const migratedCount = await kb.countItemsInCategories(batch.newCategories);
  if (migratedCount !== batch.itemCount) {
    errors.push(
      `Item count mismatch: expected ${batch.itemCount}, found ${migratedCount}`
    );
  }

  // 2. No orphan content: every migrated item has at least one new category
  const orphans = await kb.findOrphanItems(batch.oldCategory, batch.newCategories);
  if (orphans.length > 0) {
    errors.push(`${orphans.length} items have no new category assignment`);
  }

  // 3. Search regression: canonical test queries still return expected items
  const testQueries = await kb.getTestQueriesForCategory(batch.oldCategory);
  for (const query of testQueries) {
    const results = await kb.search(query.text);
    const topItemIds = results.slice(0, 5).map(r => r.id);
    if (!topItemIds.includes(query.expectedTopItemId)) {
      warnings.push(`Search regression: query "${query.text}" no longer surfaces expected item`);
    }
  }

  return {
    passed: errors.length === 0,
    errors,
    warnings
  };
}
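
A validation function only becomes a rollback gate once something acts on its result. The following sketch shows one way to wire that up: apply a batch, validate it, and on failure roll back that batch and stop before the next one starts. `BatchOps`, `RunReport`, and `runBatches` are hypothetical names; `apply` and `rollback` stand in for whatever your platform provides, and the two result interfaces are repeated so the block stands alone.

```typescript
interface MigrationBatch {
  oldCategory: string;
  newCategories: string[];
  itemCount: number;
}

interface ValidationResult {
  passed: boolean;
  errors: string[];
  warnings: string[];
}

// Hypothetical operations supplied by the caller.
interface BatchOps {
  apply(batch: MigrationBatch): Promise<void>;
  validate(batch: MigrationBatch): Promise<ValidationResult>;
  rollback(batch: MigrationBatch): Promise<void>;
}

interface RunReport {
  completed: string[]; // old categories whose batches passed validation
  failed?: { oldCategory: string; errors: string[] };
}

async function runBatches(
  batches: MigrationBatch[],
  ops: BatchOps
): Promise<RunReport> {
  const completed: string[] = [];

  for (const batch of batches) {
    await ops.apply(batch);
    const result = await ops.validate(batch);

    if (!result.passed) {
      // Gate: undo only the failing batch, then halt so later batches never run.
      await ops.rollback(batch);
      return {
        completed,
        failed: { oldCategory: batch.oldCategory, errors: result.errors },
      };
    }
    completed.push(batch.oldCategory);
  }

  return { completed };
}
```

Stopping at the first failure is a deliberate choice in this sketch: batches that already passed stay migrated, and the failing category is returned to its old state for investigation. Warnings do not trip the gate here; a stricter policy could treat search regressions as failures too.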