The Testing Pyramid Is Not a Rule. It Is a Guideline.

by Eric Hanson, Backend Developer at Clean Systems Consulting

Where the Pyramid Comes From

Mike Cohn introduced the testing pyramid in Succeeding with Agile in 2009. The idea was straightforward: unit tests are fast and cheap to run, so you should have a lot of them. End-to-end tests are slow and brittle, so you should have fewer. Integration tests sit in the middle.

This was useful guidance for a specific era of software, particularly CRUD applications with thick business logic, well-defined service boundaries, and relatively stable APIs. The pyramid works well there. The problem is that it gets taught as a universal law, applied to system architectures that look nothing like what Cohn had in mind.

When the Pyramid Does Not Fit

Consider a backend service that is primarily an integration layer — it receives events, transforms them slightly, and forwards them to three downstream APIs. The business logic is thin. The critical risk is whether the service correctly handles the various response states from those downstream APIs: rate limits, partial failures, malformed responses, retries.

A test pyramid applied literally here would produce hundreds of unit tests for the event transformation logic (which is simple and rarely breaks) and a handful of integration tests for the downstream interactions (which are where every production incident has occurred). That is the wrong shape for the risk profile.

Or consider a data pipeline service where the core complexity is in the SQL queries that aggregate and transform records. Unit tests cannot meaningfully test SQL correctness without an actual database. The test that matters runs the full query against a realistic dataset and checks the output. A pyramid-shaped suite would marginalize exactly the tests that provide real value.
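To make the data pipeline case concrete, here is a minimal sketch of a query-level test. The `orders` table, the query, and the helper names are hypothetical, and an in-memory SQLite database stands in for a real one only so that the example is self-contained. SQL dialects differ, so in practice you would run this kind of test against the same engine you use in production.

```python
import sqlite3

# Hypothetical aggregation query: revenue per customer from completed orders.
REVENUE_PER_CUSTOMER = """
    SELECT customer_id, SUM(amount_cents) AS revenue_cents
    FROM orders
    WHERE status = 'completed'
    GROUP BY customer_id
    ORDER BY customer_id
"""


def revenue_per_customer(conn: sqlite3.Connection) -> list:
    """Run the aggregation and return (customer_id, revenue_cents) rows."""
    return conn.execute(REVENUE_PER_CUSTOMER).fetchall()


def make_test_db() -> sqlite3.Connection:
    """Build a small dataset that covers the cases the query must handle."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id TEXT, "
        "amount_cents INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, amount_cents, status) VALUES (?, ?, ?)",
        [
            ("alice", 1000, "completed"),
            ("alice", 2500, "completed"),
            ("alice", 9900, "refunded"),  # must be excluded from revenue
            ("bob", 400, "completed"),
            ("carol", 700, "pending"),    # customer with no completed orders
        ],
    )
    return conn


def test_revenue_excludes_non_completed_orders():
    conn = make_test_db()
    assert revenue_per_customer(conn) == [("alice", 3500), ("bob", 400)]
```

The value of the test is in the dataset: each row exists to exercise one clause of the query, which a mocked database call cannot do.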

Classic Pyramid:         Better Shape for Integration-Heavy Services:

      /\                        ___________
     /E2E\                     /           \
    /------\                  / Integration \
   /  Integ  \               /_______________\
  /------------\            /                 \
 /  Unit Tests  \          /    Unit Tests     \
/__________________\       /___________________\

The right shape for your test suite follows the risk profile of your system, not the shape of a diagram from a 2009 book.

What the Pyramid Gets Right

The pyramid's underlying principle is still sound: favor tests that are fast and deterministic. Slow, flaky tests that require network calls and real databases are costly to maintain and erode trust in the suite. When a test fails, you want to know immediately whether it is a real failure or a timing issue with a shared test database.

So the real question is not "how many of each type?" but "what is the cheapest test that gives me confidence about this specific behavior?"

For pure computation (parsing logic, calculation functions, state machines), unit tests are the cheapest confident test. For database query correctness, integration tests against a real (or containerized) database are the cheapest. For user-facing flows in a frontend-heavy application, a small number of end-to-end tests via Playwright or Cypress may be cheaper in the long run than maintaining hundreds of component tests that mock the API layer.
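As an illustration of the pure-computation case, here is a sketch of a calculation function with its unit test. The `backoff_delay` function and its default values are assumptions made up for this example, not part of any particular library.

```python
def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Exponential backoff without jitter: base * 2**(attempt - 1), capped."""
    if attempt < 1:
        raise ValueError("attempt numbering starts at 1")
    return min(cap, base * 2 ** (attempt - 1))


def test_backoff_doubles_until_cap():
    # The delay doubles with each attempt...
    assert [backoff_delay(n) for n in (1, 2, 3, 4)] == [0.5, 1.0, 2.0, 4.0]
    # ...until it reaches the cap (0.5 * 2**9 = 256, capped to 30).
    assert backoff_delay(10) == 30.0
```

There is no I/O and nothing to mock, so a unit test gives full confidence here at almost no cost.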

Applying This to a Real System

Before deciding on the shape of your test suite, map out where your bugs actually come from. Pull up your incident history. Look at your PR comments. Ask your team where they spend time debugging.
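If your incidents are already labeled, even a rough tally helps. The sketch below assumes each incident has been tagged by hand with one category. The labels in `incidents` are made up for illustration and should be replaced with your own history.

```python
from collections import Counter

# Illustrative, made-up labels; replace with your own incident history.
incidents = [
    "service_boundary", "service_boundary", "business_logic",
    "service_boundary", "user_workflow", "service_boundary",
]


def bug_distribution(labels: list) -> list:
    """Return (category, share) pairs, most frequent category first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(category, count / total) for category, count in counts.most_common()]


for category, share in bug_distribution(incidents):
    print(f"{category:<18}{share:.0%}")
```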

If bugs cluster in business logic, lean into unit tests. If they cluster in service boundaries, invest in integration tests. If they cluster in user workflows, you need end-to-end coverage.

# For a service where the logic is simple but I/O is complex,
# the high-value test is at the integration boundary.

import responses  # third-party library that mocks HTTP calls made with `requests`

# forward_event is the service function under test; import it from
# wherever it lives in your codebase.


def test_downstream_rate_limit_triggers_retry_with_backoff():
    with responses.RequestsMock() as rsps:
        # First call: 429 Too Many Requests
        rsps.add(responses.POST, "https://api.downstream.com/events",
                 status=429, headers={"Retry-After": "2"})
        # Second call: 200 OK
        rsps.add(responses.POST, "https://api.downstream.com/events",
                 json={"status": "accepted"}, status=200)

        result = forward_event({"type": "order.created", "id": "123"})

        assert result.success is True
        assert len(rsps.calls) == 2  # Confirm the retry happened

This test is not unit or end-to-end — it is integration-level, and it is the most valuable test in this hypothetical service because it directly addresses the failure mode that matters most.
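The test above never shows `forward_event` itself. Here is a minimal sketch of what such a function might look like, assuming it uses the `requests` library and returns an object with a `success` attribute. `ForwardResult`, the `post` and `sleep` parameters, and the retry limit are all hypothetical choices made for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

DOWNSTREAM_URL = "https://api.downstream.com/events"  # URL used in the test above


@dataclass
class ForwardResult:
    success: bool
    attempts: int


def forward_event(event: dict,
                  post: Optional[Callable] = None,
                  sleep: Callable[[float], None] = time.sleep,
                  max_attempts: int = 3) -> ForwardResult:
    """Hypothetical sketch: POST the event downstream, retrying on HTTP 429."""
    if post is None:
        import requests  # imported lazily so the sketch loads without it
        post = requests.post

    for attempt in range(1, max_attempts + 1):
        response = post(DOWNSTREAM_URL, json=event, timeout=5)
        if response.status_code == 429 and attempt < max_attempts:
            # Honor Retry-After when it is a whole number of seconds;
            # otherwise fall back to exponential backoff (1s, 2s, 4s, ...).
            retry_after = response.headers.get("Retry-After", "")
            delay = float(retry_after) if retry_after.isdigit() else 2.0 ** (attempt - 1)
            sleep(delay)
            continue
        # Any non-429 response, or a 429 on the final attempt, ends the loop.
        return ForwardResult(success=200 <= response.status_code < 300,
                             attempts=attempt)
    return ForwardResult(success=False, attempts=0)  # only if max_attempts < 1
```

With this sketch, the test above would wait two real seconds because of the `Retry-After` header. Passing a stub as `sleep` keeps the test fast and lets it assert on the delay that was requested.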

The pyramid is a starting heuristic. It prompts you to ask, before you write an end-to-end test, whether a unit test would do the job. That question is worth asking every time. The answer is not always yes.

Write tests in the shape of your risks. Use the pyramid as a prompt for that conversation, not as a quota to fulfill.
