SA-301e · Module 3

Data Governance Architecture

Data governance is the architecture of trust. It answers: what data do we have, where did it come from, who can access it, and how fresh is it? Without governance, the data platform produces answers that nobody trusts — and untrusted data is unused data. Governance is not a compliance exercise. It is the architectural foundation that makes data a strategic asset instead of an expensive liability.

Do This

  • Implement a data catalog that indexes every dataset with its schema, owner, lineage, and freshness
  • Track data lineage automatically — when a metric is wrong, lineage traces the problem to the source
  • Define access policies at the column level: for example, salary data is restricted while department data is open
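
The first two practices can be sketched as a small in-memory catalog. This is a minimal illustration, not a real catalog product: the `CatalogEntry` and `Catalog` classes, their field names, and the example dataset names are all hypothetical. Each entry records schema, owner, direct upstream datasets, and a last-updated timestamp, and `trace_sources` walks lineage upstream to find where a suspect metric originates.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Set


@dataclass
class CatalogEntry:
    """One dataset in the catalog (all field names are illustrative)."""
    name: str
    owner: str
    schema: Dict[str, str]                              # column name -> type
    upstream: List[str] = field(default_factory=list)   # direct lineage parents
    last_updated: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def is_fresh(self, max_age: timedelta, now: Optional[datetime] = None) -> bool:
        """True if the dataset was updated within max_age of `now`."""
        now = now or datetime.now(timezone.utc)
        return now - self.last_updated <= max_age


class Catalog:
    """Hypothetical in-memory catalog keyed by dataset name."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def trace_sources(self, name: str) -> List[str]:
        """Walk lineage upstream and return datasets with no parents."""
        roots: List[str] = []
        seen: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            entry = self._entries.get(current)
            # Assumption: an unregistered parent is treated as a source.
            parents = entry.upstream if entry else []
            if not parents:
                roots.append(current)
            stack.extend(parents)
        return sorted(roots)


catalog = Catalog()
catalog.register(CatalogEntry("raw_orders", "ingest-team", {"order_id": "int"}))
catalog.register(CatalogEntry("raw_customers", "ingest-team", {"customer_id": "int"}))
catalog.register(CatalogEntry(
    "orders_enriched", "analytics",
    {"order_id": "int", "region": "str"},
    upstream=["raw_orders", "raw_customers"],
))
catalog.register(CatalogEntry(
    "revenue_by_region", "analytics",
    {"region": "str", "revenue": "float"},
    upstream=["orders_enriched"],
))

# If revenue_by_region looks wrong, these are the sources to inspect.
sources = catalog.trace_sources("revenue_by_region")
```

In practice the `upstream` links would be captured automatically from pipeline runs rather than declared by hand; the sketch only shows what the catalog needs to store for the trace to work.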

Avoid This

  • Rely on tribal knowledge for data discovery — "ask Sarah, she knows where the customer data is"
  • Skip lineage tracking because "we know our pipelines": you know them today, but the team six months from now will not
  • Apply access policies at the dataset level only: all-or-nothing access forces teams to duplicate datasets just to hide a few sensitive columns
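
The difference between dataset-level and column-level access can be sketched as follows. The policy table, role names, and helper functions here are assumptions made for illustration; a real platform would enforce this in the query engine or warehouse rather than in application code. The point is that one shared dataset serves both roles, so no redacted copy is needed.

```python
from typing import Any, Dict, List, Set

# Hypothetical policy: column name -> roles allowed to read it.
# Columns missing from the policy are denied by default.
COLUMN_POLICY: Dict[str, Set[str]] = {
    "employee_id": {"analyst", "hr"},
    "department": {"analyst", "hr"},
    "salary": {"hr"},
}


def readable_columns(role: str, policy: Dict[str, Set[str]] = COLUMN_POLICY) -> List[str]:
    """Return the columns this role may read, in policy order."""
    return [col for col, roles in policy.items() if role in roles]


def project_rows(
    rows: List[Dict[str, Any]],
    role: str,
    policy: Dict[str, Set[str]] = COLUMN_POLICY,
) -> List[Dict[str, Any]]:
    """Strip every column the role is not allowed to see."""
    allowed = set(readable_columns(role, policy))
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]


employees = [
    {"employee_id": 1, "department": "Sales", "salary": 90000},
    {"employee_id": 2, "department": "Support", "salary": 70000},
]

analyst_view = project_rows(employees, "analyst")  # salary removed
hr_view = project_rows(employees, "hr")            # all columns kept
```

With dataset-level policies only, giving analysts the department data would mean either exposing salaries or maintaining a second copy of the table without them.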