DS-101 · Module 1

Reading Between the Lines


Data shows you what happened. It does not show you why, and it does not show you what it missed. The most important skill in data literacy is not reading the chart — it is reading what the chart leaves out.

Survivorship bias is the silent killer of data-driven decisions. When you study only successful outcomes, you draw conclusions that do not account for the failures. "Every successful startup had a strong social media presence" sounds compelling until you look at the failed startups and find that most of them had a strong social media presence too. A trait shared by winners and losers alike cannot explain the difference between them. You are looking at survivors and mistaking their traits for causes. The missing data, the failures, is where the real insight lives.
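To make the startup example concrete, here is a minimal sketch in Python. The counts are invented purely for illustration and are not real startup data; the field names `survived` and `social` are likewise hypothetical. It compares what you see when you look only at survivors with what you see when the failures are included.

```python
# Hypothetical counts, invented for illustration only (not real startup data).
# "social" means the startup had a strong social media presence.
startups = (
    [{"survived": True, "social": True}] * 9
    + [{"survived": True, "social": False}] * 1
    + [{"survived": False, "social": True}] * 81
    + [{"survived": False, "social": False}] * 9
)


def share_with_trait(rows):
    """Fraction of the given startups that have the trait."""
    return sum(r["social"] for r in rows) / len(rows)


def survival_rate(rows):
    """Fraction of the given startups that survived."""
    return sum(r["survived"] for r in rows) / len(rows)


survivors = [r for r in startups if r["survived"]]
failures = [r for r in startups if not r["survived"]]
with_trait = [r for r in startups if r["social"]]
without_trait = [r for r in startups if not r["social"]]

# The survivor-only view: the trait looks like a hallmark of success.
print("Survivors with trait:", share_with_trait(survivors))  # 0.9

# The missing data: failures have the trait just as often.
print("Failures with trait: ", share_with_trait(failures))   # 0.9

# The comparison that matters: survival with and without the trait.
print("Survival with trait:   ", survival_rate(with_trait))     # 0.1
print("Survival without trait:", survival_rate(without_trait))  # 0.1
```

In this made-up dataset, 90% of survivors have the trait, but so do 90% of failures, and the survival rate is 10% either way. The survivor-only number is accurate and still tells you nothing about what causes success.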

  1. Ask what data is missing. Every dataset excludes something. Customer satisfaction surveys exclude the customers who were too frustrated to respond. Sales reports exclude the deals that never entered the pipeline. Understand what the collection method missed before trusting the results.
  2. Ask who is not represented. If your data comes from a specific channel, it only represents people who use that channel. Website analytics miss the prospects who called directly. Email surveys miss the customers who unsubscribed. The people not in your data may be the ones you most need to understand.
  3. Ask what would change your mind. Before you make a decision, define the data that would reverse it. If no data could change your mind, you are not making a data-driven decision; you are using data to confirm what you already believe.
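
The first two questions can be illustrated with a small Python sketch of the satisfaction survey example. The customers and scores below are invented for illustration. In practice you would not know the scores of the people who did not respond, which is exactly the problem; the sketch pretends you do so the size of the gap is visible.

```python
# Hypothetical customers, invented for illustration only.
# Each tuple is (satisfaction score from 1 to 5, responded to the survey?).
customers = [
    (5, True), (4, True), (5, True), (4, True),
    (2, False), (1, False), (2, True), (1, False),
]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# What the survey report shows: respondents only.
observed = mean(score for score, responded in customers if responded)

# What is true across all customers, including the silent ones.
actual = mean(score for score, _ in customers)

# How much of the customer base the survey covers at all.
response_rate = mean(1 if responded else 0 for _, responded in customers)

print("Survey average:       ", observed)       # 4.0
print("True average:         ", actual)         # 3.0
print("Share who responded:  ", response_rate)  # 0.625
```

Here the survey reports an average of 4.0 while the true average is 3.0, because three of the four unhappy customers never answered. The response rate is the one number you can usually measure, and a low one is a signal to ask who is missing before trusting the average.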