SA-101 · Module 2

Scalability Thinking

3 min read

Over-engineering is the architect's occupational hazard. The cathedral instinct is strong — the urge to design for a million users when the client has a hundred, to build a distributed system when a single server would suffice, to add caching layers for load that does not exist. Scalability thinking is not about building for maximum scale. It is about designing for the right scale — with clear paths to grow when the evidence says it is time.

The question is never "will this scale?" in the abstract. The question is: "what does this need to handle in 6 months, and what would need to change if the load doubles after that?" If the answer to the second question is "we would add a second instance behind a load balancer," the architecture is scalable. If the answer is "we would rewrite the data layer," the architecture has a scaling cliff — and you should know where the cliff is before you ship.
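The "where is the cliff" question can be made concrete with back-of-envelope arithmetic: how many doublings of today's load fit before a known breaking point? A minimal sketch — the numbers and the saturation figure are hypothetical, and the real cliff has to come from your own measurements:

```python
import math

def doublings_until_cliff(current_load, cliff_capacity):
    """How many times can load double before hitting the point
    where the architecture breaks (e.g. a single-writer DB saturates)?"""
    if current_load <= 0 or cliff_capacity <= current_load:
        return 0
    return math.floor(math.log2(cliff_capacity / current_load))

# Hypothetical example: 120 requests/sec today; the data layer
# is believed to saturate around 2,000 requests/sec.
print(doublings_until_cliff(120, 2000))  # 4 doublings of headroom
```

Four doublings of headroom is a comfortable answer; zero or one means the cliff is close enough that the growth path needs to be designed now, not later.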

Do This

  • Design for current load with identified growth paths
  • Know where your scaling cliffs are — the points where the architecture breaks
  • Build stateless where possible so horizontal scaling remains an option
  • Measure before you optimize — actual bottlenecks, not imagined ones
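The "build stateless" point can be illustrated with a toy request handler. Session state lives in an external store — here a plain dict stands in for a shared backing service such as Redis — so any instance behind a load balancer can serve any request. All names are illustrative:

```python
# Shared external store; a dict stands in for a real backing
# service such as Redis (illustrative only).
SESSION_STORE = {}

def handle_request(session_id, message):
    """Stateless handler: reads and writes session data through the
    shared store, never through instance-local memory."""
    session = SESSION_STORE.setdefault(session_id, {"history": []})
    session["history"].append(message)
    # Whichever replica receives the next request sees the same state.
    return len(session["history"])

print(handle_request("abc", "hello"))  # 1
print(handle_request("abc", "again"))  # 2 — any instance gives the same answer
```

Because no state is trapped in one process, "add a second instance behind a load balancer" stays a real option instead of a rewrite.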

Avoid This

  • Build for a million users on day one when you have fifty
  • Add caching, message queues, and microservices to a prototype
  • Assume that "cloud-native" automatically means "scalable"
  • Optimize for performance problems you have not measured yet
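The "measure first" rule is cheap to follow. A minimal sketch using only the standard library to time suspected hot spots before touching any of them — the two workload functions are hypothetical stand-ins for your own code paths:

```python
import time

def timed(fn, *args, repeat=5):
    """Return the best-of-N wall-clock time for fn(*args)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical suspects: which one is actually the bottleneck?
def parse(n):     sum(int(x) for x in ["42"] * n)
def serialize(n): "".join(str(x) for x in range(n))

for name, fn in [("parse", parse), ("serialize", serialize)]:
    print(f"{name}: {timed(fn, 10_000):.4f}s")
```

For anything beyond a quick check, the standard `cProfile` and `timeit` modules do the same job with less hand-rolling — the point is simply that the numbers come before the optimization.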