DS-301f · Module 1

The Six Quality Dimensions

Data quality is not one thing. It is six measurable dimensions. Completeness: are all expected records present? Accuracy: do the values reflect reality? Consistency: do the same entities have the same values across systems? Timeliness: is the data current enough for the decision? Validity: do the values conform to the expected format and range? Uniqueness: is each entity represented exactly once?

Each dimension is independently measurable. A dataset can be complete but inaccurate (all records present, some values wrong). It can be accurate but not timely (correct values from last month). Measuring each dimension separately reveals specific issues that a single "data quality score" would obscure.
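
To make "independently measurable" concrete, here is a minimal sketch that scores three of the dimensions separately on a tiny in-memory dataset. The records, field names, email pattern, and function names are all hypothetical and chosen only for illustration; a real pipeline would define its own rules per field.

```python
import re

# Hypothetical sample records; field names and values are illustrative only.
records = [
    {"id": 1, "email": "ana@example.com", "amount": 150000},
    {"id": 2, "email": None, "amount": 120000},
    {"id": 3, "email": "not-an-email", "amount": 90000},
    {"id": 3, "email": "li@example.com", "amount": 90000},  # duplicate id
]

# Deliberately simple format rule (an assumption, not a full email validator).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_populated(value):
    """Treat None and the empty string as missing."""
    return value not in (None, "")


def completeness(rows, field):
    """Share of rows where the field is populated."""
    if not rows:
        return 1.0
    return sum(1 for r in rows if is_populated(r.get(field))) / len(rows)


def validity(rows, field, is_valid):
    """Share of populated values that pass the format or range check."""
    values = [r[field] for r in rows if is_populated(r.get(field))]
    if not values:
        return 1.0
    return sum(1 for v in values if is_valid(v)) / len(values)


def uniqueness(rows, key):
    """Share of rows that carry a distinct key value."""
    if not rows:
        return 1.0
    return len({r[key] for r in rows}) / len(rows)


# One score per dimension, reported side by side rather than averaged.
scores = {
    "completeness(email)": completeness(records, "email"),
    "validity(email)": validity(
        records, "email", lambda v: bool(EMAIL_PATTERN.match(v))
    ),
    "uniqueness(id)": uniqueness(records, "id"),
}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Keeping the scores in separate entries is the point of the sketch: a missing email, a malformed email, and a duplicated id each show up under their own dimension instead of disappearing into one blended number.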

  1. Completeness: Count expected records versus actual records, then compare field population rates: what percentage of records have each required field populated? A 92% completion rate for a required field means 8% of records are missing a value that downstream processes need.
  2. Accuracy: Sample records and verify them against the source of truth. If a CRM record shows a deal at $150K but the contract says $120K, that record fails the accuracy check. Sample size: 5% of records monthly, stratified by source.
  3. Consistency: Compare the same entity across systems. Does the customer name in the CRM match the billing system? Does the deal value in the pipeline match the signed contract? An inconsistency means at least one system is wrong, and decisions based on the wrong system will be wrong too.
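
The accuracy and consistency checks described above can be sketched roughly as follows. This is an illustration under stated assumptions, not a prescribed implementation: the data, the `stratified_sample` and `inconsistent_keys` helpers, and the fixed seed are all hypothetical, and the 5% fraction and stratification by source come from the guideline above.

```python
import random
from collections import defaultdict


def stratified_sample(rows, strata_key, fraction=0.05, seed=0):
    """Draw roughly `fraction` of rows from each stratum, at least one each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    groups = defaultdict(list)
    for row in rows:
        groups[row[strata_key]].append(row)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def inconsistent_keys(system_a, system_b):
    """Keys present in both mappings whose values differ."""
    shared = system_a.keys() & system_b.keys()
    return sorted(k for k in shared if system_a[k] != system_b[k])


# Hypothetical CRM deals: 60 from a "web" source, 40 from a "partner" source.
crm_rows = [
    {"id": i, "source": "web" if i < 60 else "partner", "deal_value": 100_000}
    for i in range(100)
]
# Hypothetical source of truth: signed contract values keyed by deal id.
contract_values = {row["id"]: 100_000 for row in crm_rows}
crm_rows[7]["deal_value"] = 150_000  # CRM shows $150K ...
contract_values[7] = 120_000         # ... but the contract says $120K

# Accuracy: verify a 5% sample, stratified by source, against the contracts.
sample = stratified_sample(crm_rows, "source", fraction=0.05)
matches = sum(1 for r in sample if r["deal_value"] == contract_values[r["id"]])
accuracy = matches / len(sample)

# Consistency: compare the same customer across two systems.
crm_names = {"C1": "Acme Corp", "C2": "Globex"}
billing_names = {"C1": "Acme Corp", "C2": "Globex Inc."}
inconsistent = inconsistent_keys(crm_names, billing_names)

print(f"sampled {len(sample)} records, accuracy {accuracy:.0%}")
print(f"inconsistent customers: {inconsistent}")
```

Note that the sampled accuracy check may or may not catch the one bad deal, since only 5 of 100 records are inspected. That is a property of sampling, whereas the consistency comparison runs across every shared key.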