Sharding Your Database Sounds Exciting Until You Actually Have to Do It

by Eric Hanson, Backend Developer at Clean Systems Consulting

What Sharding Actually Is

Sharding means horizontally partitioning your data across multiple database instances. Instead of one database holding all your data, you have N databases (shards) each holding 1/N of the data. Queries that access data on one shard execute locally. Queries that need data across shards require cross-shard coordination.

The performance benefit is real: write throughput scales with shard count, each shard's dataset is smaller (better index performance, less I/O), and hardware costs can be distributed. The operational cost is also real, and it is large.

Most teams that say they "need sharding" actually need read replicas, better query optimization, or a larger instance. Sharding is the right answer for a small class of problems. It is usually the wrong first move.

What You Give Up

Cross-shard queries become expensive or impossible. If your data is sharded by user ID, a query that aggregates data across users requires querying every shard and merging results in the application. A report that took one SQL query now requires N queries plus merge logic. If the merge is complex — sorting, grouping, deduplication — the application code bears that burden.

Transactions across shards require distributed coordination. ACID transactions within a single shard work normally. Transactions that span shards — moving a record from shard A to shard B, or updating records on two shards atomically — require two-phase commit (2PC) or saga patterns. 2PC is slow and failure-prone. Sagas are complex to implement correctly. Most applications simply avoid cross-shard transactions, which constrains the data model.

Joins across shards are gone. SQL joins work within a shard. Cross-shard joins require fetching data from multiple shards and joining in the application layer, which is typically slower and harder to optimize than a database join.

Rebalancing is painful. When you add a new shard, you need to redistribute some data to the new shard. This requires a data migration while the system is live — reading records from existing shards, writing them to the new shard, updating routing, deleting the migrated records. Done wrong, this is a data loss event.

# What "simple" sharding adds to your operational footprint:
- Shard routing logic in the application or a middleware layer
- Shard key design (wrong key = hot shards, uneven distribution)
- Cross-shard query handling (application-level aggregation)
- Distributed transaction handling for any cross-shard writes
- Rebalancing procedures when adding shards
- Per-shard monitoring and alerting
- Schema migrations across all shards simultaneously
- Backup and restore per shard

When It Is Actually Warranted

Sharding is warranted when you have exhausted these options in order:

  1. Query optimization and indexing
  2. Connection pooling (PgBouncer)
  3. Read replicas for read offload
  4. Vertical scaling the primary
  5. Caching for hot-path reads
  6. Application-level data archiving (moving old data to cold storage, reducing working set)

If you have done all of these and write throughput or primary storage is still the bottleneck, sharding may be necessary. The typical threshold where sharding becomes worth the cost is write volumes in the millions per second sustained, or datasets in the tens of terabytes where the working set no longer fits in memory.

Alternatives That Are Often Sufficient

Partitioning within a single database. PostgreSQL's table partitioning (declarative partitioning by range or list) splits a large table into physical partitions within the same database instance. Queries that filter on the partition key only scan the relevant partition. This gives some of the query performance benefit of sharding with none of the cross-shard coordination complexity.

Functional decomposition. Split different features into different databases before sharding any single database. An analytics database separate from the transactional database, each sized and scaled for its workload, is often more effective than sharding the combined database.

Managed distributed databases. CockroachDB and Google Spanner handle sharding and replication internally. You use them like a single PostgreSQL-compatible database; they handle shard management. You still pay in consistency trade-offs (CockroachDB uses serializable isolation with some distributed transaction latency; Spanner is expensive) but you offload the operational complexity.

Try everything else first. Sharding is a last resort, not a scaling strategy.

Scale Your Backend - Need an Experienced Backend Developer?

We provide backend engineers who join your team as contractors to help build, improve, and scale your backend systems.

We focus on clean backend design, clear documentation, and systems that remain reliable as products grow. Our goal is to strengthen your team and deliver backend systems that are easy to operate and maintain.

We work from our own development environments and support teams across US, EU, and APAC timezones. Our workflow emphasizes documentation and asynchronous collaboration to keep development efficient and focused.

  • Production Backend Experience. Experience building and maintaining backend systems, APIs, and databases used in production.
  • Scalable Architecture. Design backend systems that stay reliable as your product and traffic grow.
  • Contractor Friendly. Flexible engagement for short projects, long-term support, or extra help during releases.
  • Focus on Backend Reliability. Improve API performance, database stability, and overall backend reliability.
  • Documentation-Driven Development. Development guided by clear documentation so teams stay aligned and work efficiently.
  • Domain-Driven Design. Design backend systems around real business processes and product needs.

Tell us about your project

Our offices

  • Copenhagen
    1 Carlsberg Gate
    1260, København, Denmark
  • Magelang
    12 Jalan Bligo
    56485, Magelang, Indonesia

More articles

Hourly vs Project Based Pricing: What Works Better for Backend Contractors

Neither pricing model is universally better — but choosing the wrong one for the wrong engagement costs you money, time, or both.

Read more

Miami's Tech Scene Is Growing Fast — Its Backend Talent Pool Is Not Keeping Up

Miami got the founders. It got the venture capital. The senior backend engineers are still catching up.

Read more

REST vs Messaging in Microservices: Picking the Wrong One Will Hurt You

REST and asynchronous messaging are not interchangeable communication styles — they make fundamentally different promises about consistency, coupling, and failure behavior, and choosing the wrong one for a given interaction is a load-bearing architectural mistake.

Read more

How to Stay Visible to Clients Even When You Are Not Working With Them

Being top of mind with past and potential clients does not require constant selling. It requires occasional, genuine presence in their professional orbit.

Read more