SA-101 · Module 2

Scalability Thinking

3 min read

Over-engineering is the architect's occupational hazard. The cathedral instinct is strong — the urge to design for a million users when the client has a hundred, to build a distributed system when a single server would suffice, to add caching layers for load that does not exist. Scalability thinking is not about building for maximum scale. It is about designing for the right scale — with clear paths to grow when the evidence says it is time.

The question is never "will this scale?" in the abstract. The question is: "what does this need to handle in 6 months, and what would need to change if the load doubles after that?" If the answer to the second question is "we would add a second instance behind a load balancer," the architecture is scalable. If the answer is "we would rewrite the data layer," the architecture has a scaling cliff — and you should know where the cliff is before you ship.
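The "where is the cliff" question can be made concrete with back-of-envelope arithmetic: how many doublings of today's load fit before a known breaking point? A minimal sketch — the numbers and the saturation figure are hypothetical, and the real cliff has to come from your own measurements:

```python
import math

def doublings_until_cliff(current_load, cliff_capacity):
    """How many times can load double before hitting the point
    where the architecture breaks (e.g. a single-writer DB saturates)?"""
    if current_load <= 0 or cliff_capacity <= current_load:
        return 0
    return math.floor(math.log2(cliff_capacity / current_load))

# Hypothetical example: 120 requests/sec today; the data layer
# is believed to saturate around 2,000 requests/sec.
print(doublings_until_cliff(120, 2000))  # 4 doublings of headroom
```

Four doublings of headroom is a comfortable answer; zero or one means the cliff is close enough that the growth path needs to be designed now, not later.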

Do This

  • Design for current load with identified growth paths
  • Know where your scaling cliffs are — the points where the architecture breaks
  • Build stateless where possible so horizontal scaling remains an option
  • Measure before you optimize — actual bottlenecks, not imagined ones
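The "build stateless" point can be illustrated with a toy request handler. Session state lives in an external store — here a plain dict stands in for a shared backing service such as Redis — so any instance behind a load balancer can serve any request. All names are illustrative:

```python
# Shared external store; a dict stands in for a real backing
# service such as Redis (illustrative only).
SESSION_STORE = {}

def handle_request(session_id, message):
    """Stateless handler: reads and writes session data through the
    shared store, never through instance-local memory."""
    session = SESSION_STORE.setdefault(session_id, {"history": []})
    session["history"].append(message)
    # Whichever replica receives the next request sees the same state.
    return len(session["history"])

print(handle_request("abc", "hello"))  # 1
print(handle_request("abc", "again"))  # 2 — any instance gives the same answer
```

Because no state is trapped in one process, "add a second instance behind a load balancer" stays a real option instead of a rewrite.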

Avoid This

  • Build for a million users on day one when you have fifty
  • Add caching, message queues, and microservices to a prototype
  • Assume that "cloud-native" automatically means "scalable"
  • Optimize for performance problems you have not measured yet
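The "measure first" rule is cheap to follow. A minimal sketch using only the standard library to time suspected hot spots before touching any of them — the two workload functions are hypothetical stand-ins for your own code paths:

```python
import time

def timed(fn, *args, repeat=5):
    """Return the best-of-N wall-clock time for fn(*args)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical suspects: which one is actually the bottleneck?
def parse(n):     sum(int(x) for x in ["42"] * n)
def serialize(n): "".join(str(x) for x in range(n))

for name, fn in [("parse", parse), ("serialize", serialize)]:
    print(f"{name}: {timed(fn, 10_000):.4f}s")
```

For anything beyond a quick check, the standard `cProfile` and `timeit` modules do the same job with less hand-rolling — the point is simply that the numbers come before the optimization.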