SA-201b · Module 2
Data Flow Mapping
Data flow mapping is the architecture equivalent of a plumbing diagram. It shows where data originates, how it moves between systems, where it is transformed, where it is stored, and where it is consumed. Without a data flow map, integration debugging is guesswork. With one, you can trace any data anomaly from source to destination and identify the exact point where the flow breaks.
- Identify Data Sources: Where does each piece of data originate? The CRM holds customer records. The ERP holds financial data. The AI model produces inference outputs. Each source is a starting point on the map. If you cannot identify the authoritative source for a piece of data, you have a data ownership problem that the architecture must resolve.
- Map Transformations: Data rarely moves between systems in the same format. Customer names are concatenated. Dates are reformatted. Currencies are converted. Each transformation is a point on the map where data changes shape, and each one is a potential source of data quality issues. Document every transformation with its logic, its trigger, and its error handling.
- Identify Storage Points: Where is data persisted, and for how long? Databases, caches, message queues, file systems, and third-party services all store data. Each storage point has retention, security, and compliance implications. The data flow map must show not just where data moves but where it rests.
- Document Consumption: Who reads the data at each storage point? What decisions are made based on it? Data that is stored but never consumed is either waste or a latent capability. Data that is consumed by critical processes needs higher reliability guarantees than data consumed by reporting dashboards.
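
The four steps above can be sketched as a small directed graph. This is a minimal illustration, not a prescribed tool: the class names (Node, Flow, DataFlowMap), the field names, and the example systems (crm, erp, warehouse, staging_cache, billing_dashboard) are all hypothetical. It shows one way to record sources, transformations, storage points, and consumers so that an anomaly can be traced back toward its origin.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A system on the map. `kind` is "source", "store", or "consumer"."""
    name: str
    kind: str
    retention_days: Optional[int] = None  # only meaningful for stores


@dataclass
class Flow:
    """A directed movement of data, with its documented transformation."""
    src: str
    dst: str
    transformation: str = "none"
    trigger: str = "unspecified"
    on_error: str = "unspecified"


@dataclass
class DataFlowMap:
    nodes: Dict[str, Node] = field(default_factory=dict)
    flows: List[Flow] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_flow(self, flow: Flow) -> None:
        # Refuse flows that reference systems missing from the map.
        for endpoint in (flow.src, flow.dst):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint}")
        self.flows.append(flow)

    def trace_upstream(self, name: str) -> List[Flow]:
        """Breadth-first walk from `name` back toward its sources."""
        visited = {name}
        queue = deque([name])
        path: List[Flow] = []
        while queue:
            current = queue.popleft()
            for flow in self.flows:
                if flow.dst == current:
                    path.append(flow)
                    if flow.src not in visited:
                        visited.add(flow.src)
                        queue.append(flow.src)
        return path

    def unconsumed_stores(self) -> List[str]:
        """Storage points that no flow reads from."""
        read_from = {flow.src for flow in self.flows}
        return [
            node.name
            for node in self.nodes.values()
            if node.kind == "store" and node.name not in read_from
        ]


# Hypothetical example map; every system name here is an assumption.
flow_map = DataFlowMap()
flow_map.add_node(Node("crm", "source"))
flow_map.add_node(Node("erp", "source"))
flow_map.add_node(Node("warehouse", "store", retention_days=365))
flow_map.add_node(Node("staging_cache", "store", retention_days=1))
flow_map.add_node(Node("billing_dashboard", "consumer"))

flow_map.add_flow(Flow("crm", "warehouse", "concatenate first and last name",
                       trigger="nightly batch", on_error="reject row and log"))
flow_map.add_flow(Flow("erp", "warehouse", "convert currency",
                       trigger="nightly batch", on_error="hold batch"))
flow_map.add_flow(Flow("warehouse", "billing_dashboard"))
flow_map.add_flow(Flow("crm", "staging_cache"))
```

In this sketch, flow_map.trace_upstream("billing_dashboard") returns the flows that feed the dashboard, nearest first, so each documented transformation along the way can be checked in turn. flow_map.unconsumed_stores() lists storage points that nothing reads from, which here is staging_cache: a candidate for either removal or a closer look.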