The Simplest System That Solves the Problem Is Almost Always the Right One
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Bias Toward Complexity
Engineering culture has a bias toward complexity. Complex systems are more interesting to build. They signal technical sophistication. They fill architecture diagrams in ways that simple systems do not. A system with a message queue, a service mesh, a distributed cache, and six microservices feels more "serious" than a monolith with a PostgreSQL database.
This bias is expensive. Complex systems have more failure modes, more operational burden, more surface area for bugs, and more things for engineers to learn and maintain. Every piece of complexity should be there because it is solving a specific problem that simpler alternatives cannot. Complexity that exists for its own sake, or to signal technical maturity, is waste.
The simplest system that correctly solves the problem is not a starting point you iterate away from. It is often the destination.
What Simplicity Actually Means
Simple is not the same as naive. A simple system is one where every component and design decision can be justified by a specific problem it solves. It is not a system with fewer features. It is a system without components that do not earn their presence.
A PostgreSQL database with proper indexing and connection pooling is simple. It is also capable of handling most application data requirements up to significant scale. Adding a distributed cache, a sharding layer, and a read replica before you have evidence that the PostgreSQL instance is the bottleneck is not an upgrade to the simple system — it is complexity without justification.
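The "proper indexing" half of that claim can be verified rather than assumed: look at the query plan, and add an index only when the plan shows a full scan on a hot query. A minimal sketch of that workflow, using SQLite from Python's standard library as a stand-in for PostgreSQL (table and column names are invented for illustration; the principle carries over to `EXPLAIN` in PostgreSQL):

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN reports whether a query scans the table
    # or uses an index; the detail text is in the last column.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)  # no index on customer_id yet: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # same query now resolved via the index

print("SCAN" in before, "USING INDEX" in after)
```

The same evidence-first habit applies to connection pooling: turn it on when connection churn shows up in metrics, not because a diagram looks incomplete without it.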
A monolith deployed to two instances behind a load balancer is simple. It can handle tens of thousands of daily active users. Splitting it into 12 microservices before you have team-scale deployment bottlenecks or dramatically different scaling requirements is complexity without justification.
# The simplicity test for any proposed component:
1. What specific problem does this component solve?
2. Has that problem manifested in the current system?
3. What is the simplest alternative that addresses it?
4. Why is this component better than the simpler alternative for this specific case?
If you cannot answer all four questions clearly, the component has not earned its place in the design.
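The test works just as well as a checklist in a design-review template. A toy sketch of enforcing it mechanically (the function name, question keys, and the cache proposal are illustrative, not from any real review process):

```python
def earns_its_place(answers: dict) -> bool:
    """Apply the four-question simplicity test to a proposed component.

    `answers` maps each question to the team's honest answer; an empty
    or missing answer means the question could not be answered clearly.
    """
    questions = (
        "specific problem solved",
        "problem manifested in current system",
        "simplest alternative considered",
        "why better than the simpler alternative",
    )
    return all(answers.get(q, "").strip() for q in questions)

# A distributed cache proposed before any measured read bottleneck:
cache_proposal = {
    "specific problem solved": "reads feel slow",
    "problem manifested in current system": "",  # no data yet
    "simplest alternative considered": "",
    "why better than the simpler alternative": "",
}
verdict = earns_its_place(cache_proposal)
print(verdict)  # False: the component has not earned its place
```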
The Cost of Each Layer of Complexity
Every layer added to a system has ongoing costs that compound:
- Operational knowledge: engineers must understand the component to operate it correctly
- Failure modes: each component can fail, and its failure mode must be understood and handled
- Testing surface: integration tests must cover interactions with the new component
- Debugging overhead: production issues now span more components, requiring more context to diagnose
- Onboarding cost: new engineers must learn the system including this component
These costs are invisible in the architecture diagram but very visible in the engineering team's time. A system that requires 40% of engineering time in operational maintenance is one where 40% of engineering time is not going into the product.
When Complexity Is Warranted
Complexity earns its place when a simpler alternative cannot meet a specific, validated requirement. The requirement must be current — not anticipated. The failure of the simpler alternative must be demonstrated — not assumed.
Read replicas earn their place when the primary is demonstrably the read bottleneck. Microservices earn their place when independent deployment becomes a demonstrated bottleneck or when scaling requirements are demonstrably different between domains. A message queue earns its place when synchronous processing demonstrably blocks the critical path or when retry and durability requirements cannot be met without it.
The word "demonstrably" is doing a lot of work here. Demonstrably means you have data — performance metrics, incident postmortems, throughput numbers — that shows the simpler approach failing. Not "we think it might fail at scale." Actual failure.
The Practical Implication
When designing a system, start by asking: what is the simplest possible implementation that correctly handles the known requirements? Build that. Measure it. When it fails a specific requirement — not when you anticipate it might — add the complexity that addresses that specific failure.
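"Measure it" can be as small as tracking a latency percentile against the stated requirement, and only opening the conversation about caches or replicas when the number actually breaches it. A minimal sketch using the standard library (the 250 ms requirement and the sample latencies are invented):

```python
import statistics

def p95(samples_ms):
    # statistics.quantiles with n=20 returns 19 cut points;
    # the last one is the 95th percentile.
    return statistics.quantiles(samples_ms, n=20)[-1]

requirement_ms = 250
observed = [120, 130, 110, 145, 160, 180, 125, 140, 135, 150,
            115, 128, 142, 138, 155, 148, 132, 127, 144, 151]

if p95(observed) <= requirement_ms:
    verdict = "requirement met: no new component justified yet"
else:
    verdict = "requirement failed: now discuss targeted complexity"

print(verdict)
```

The decision gate matters more than the tooling: until the measurement fails the requirement, the simple system stays.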
This is not an argument against ambition. It is an argument against paying complexity costs now for problems you have not yet confirmed you have. The simplest system that works is almost always easier to operate, easier to debug, easier to scale from, and easier to change than a complex system designed for imagined requirements.
Build what solves the problem. Nothing more.