The Line Between Unit Tests and Integration Tests Is Blurrier Than You Think

by Eric Hanson, Backend Developer at Clean Systems Consulting

The Definitional Argument That Goes Nowhere

Every team eventually has this argument: "Is this test a unit test or an integration test?" One developer says a unit test must test a single class in isolation. Another says any test that does not touch external infrastructure is a unit test. A third says the distinction is about speed, not scope.

All three describe real distinctions that matter in practice. None of them is the canonical definition, because there is none. The term "unit test" has no authoritative specification. It has a loose community consensus that tends to be narrower in academic literature and broader in engineering practice, with significant variation within each.

Spending time on this argument is almost always less useful than asking the question that actually matters: does this test give me fast, reliable, actionable feedback?

The Properties That Actually Matter

Instead of categorical membership, evaluate tests on properties:

Speed. Can it run in under 100 milliseconds? Can the full suite run in under 30 seconds? Slow tests get run less often, which means slower feedback, which means the test is less valuable.

Determinism. Does it produce the same result on every run, on every machine, regardless of external state? A flaky test is in many ways worse than no test: it trains developers to ignore red builds.

Isolation. Does a failure in this test tell you specifically what broke, or does it tell you "something in this general area is wrong"? A test that exercises fifteen classes to make one assertion will point you at a general region of the codebase, not a specific behavior.

Maintenance cost. Does the test break when the implementation is refactored without a behavioral change? High-maintenance tests slow down development more than they help.
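To make the determinism property concrete, here is a minimal sketch of one common way to remove a dependency on external state: passing the clock in rather than reading it inside the function. The names isExpired, session, and expiresAt are hypothetical and exist only for illustration.

```javascript
// Hypothetical example: none of these names come from a real codebase.
// The clock is a parameter, so a test can pin "now" to a fixed value.
function isExpired(session, now = () => Date.now()) {
  return now() >= session.expiresAt;
}

const session = { expiresAt: 1000 };

// With a fixed clock the result is the same on every run and every machine.
console.log(isExpired(session, () => 999));  // false
console.log(isExpired(session, () => 1000)); // true
```

Production callers can omit the second argument and get the real clock. Only the tests need to know the clock is replaceable.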

Where the Boundary Gets Genuinely Blurry

Consider a test that exercises a service class with two or three real collaborators but no I/O. No database, no network, no filesystem. All in memory. Is this a unit test?

By the "single class in isolation" definition: no. By the "no external infrastructure" definition: yes. By the "fast and deterministic" criteria: yes, almost certainly.

For practical purposes, this test is fine. It is fast, deterministic, and focused. Whether it is called a unit test or an integration test has no operational significance. Put it in the "fast" suite. Run it constantly. Use it.

// Is this a unit test or an integration test?
// It uses three real collaborators, no mocks.
// But it is pure in-memory, sub-millisecond, deterministic.

describe('OrderPriceCalculator', () => {
  const taxService = new TaxService();           // Real, no I/O
  const discountService = new DiscountService(); // Real, no I/O
  const calculator = new OrderPriceCalculator(taxService, discountService);

  it('applies discount before tax', () => {
    const order = { subtotal: 100, discountCode: 'SAVE10', region: 'US_CA' };
    const result = calculator.calculate(order);
    // $100 - 10% = $90, then + 8.5% CA tax = $97.65
    // toBeCloseTo avoids a brittle exact comparison on a floating-point total
    expect(result.total).toBeCloseTo(97.65, 2);
  });
});

The categorical answer to "unit or integration?" is genuinely unclear. The operational answer is not: the test belongs in the fast suite, where it runs constantly and does useful work.
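The test above assumes three classes without showing them. As a rough sketch of what such in-memory collaborators might look like, the following keeps the class names from the test, but the method names (rateFor), the rate tables, and the rounding rule are all assumptions made for illustration.

```javascript
// Hypothetical implementations: only the class names and the
// calculate(order) -> { total } shape come from the test above.
class DiscountService {
  constructor() {
    this.rates = new Map([['SAVE10', 0.10]]); // assumed: SAVE10 means 10% off
  }
  // Discount as a fraction of the subtotal; unknown codes give no discount.
  rateFor(code) {
    return this.rates.get(code) ?? 0;
  }
}

class TaxService {
  constructor() {
    this.rates = new Map([['US_CA', 0.085]]); // assumed: 8.5% for US_CA
  }
  // Tax as a fraction of the discounted amount; unknown regions are untaxed.
  rateFor(region) {
    return this.rates.get(region) ?? 0;
  }
}

class OrderPriceCalculator {
  constructor(taxService, discountService) {
    this.taxService = taxService;
    this.discountService = discountService;
  }
  calculate(order) {
    // The discount is applied first, then tax on the discounted amount.
    const discounted = order.subtotal * (1 - this.discountService.rateFor(order.discountCode));
    const taxed = discounted * (1 + this.taxService.rateFor(order.region));
    // Round to whole cents so floating-point noise does not leak into the total.
    return { total: Math.round(taxed * 100) / 100 };
  }
}

const calculator = new OrderPriceCalculator(new TaxService(), new DiscountService());
const order = { subtotal: 100, discountCode: 'SAVE10', region: 'US_CA' };
console.log(calculator.calculate(order).total); // 97.65
```

Nothing here touches a database, the network, or the filesystem, which is why wiring the real objects together costs nothing in speed or determinism.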

When the Distinction Does Matter

The distinction between unit and integration becomes meaningful at the infrastructure boundary. A test that starts a Docker container is fundamentally different from one that does not: in startup time, in environmental requirements, and in flakiness profile. Such tests cannot run on every file save without making the feedback loop unusably slow.

The practical split that matters:

  • Runs anywhere, fast (under 100ms, no Docker, no network): Run on every save, every commit, pre-push
  • Requires infrastructure (database, broker, mock HTTP server): Run on pull request, pre-deploy

That is the operationally significant boundary. Call each category whatever you want.
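One way to encode that split, assuming Jest is the test runner (the earlier example uses Jest-style describe/it/expect), is a projects entry in the Jest configuration. The directory layout and the project names "fast" and "infra" below are assumptions, not something the article prescribes.

```javascript
// jest.config.js (a sketch; the test/fast and test/infra paths are hypothetical)
const config = {
  projects: [
    {
      // In-memory tests: no Docker, no network. Cheap enough to run on every save.
      displayName: 'fast',
      testMatch: ['<rootDir>/test/fast/**/*.test.js'],
    },
    {
      // Tests that need a database, broker, or HTTP server to be running.
      displayName: 'infra',
      testMatch: ['<rootDir>/test/infra/**/*.test.js'],
    },
  ],
};

module.exports = config;
```

With a layout like this, `jest --selectProjects fast` can be wired into a file watcher or pre-push hook, while `jest --selectProjects infra` runs in the pull-request pipeline. A test moves between categories by changing directory, which keeps the decision visible in code review.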

The teams that spend the most time arguing about definitions tend to have suites that are neither fast nor good at catching real bugs. The teams that skip the argument and ask "does this test run fast and catch real bugs?" tend to have better suites. The taxonomy is a communication tool, not an optimization target. Use the one that helps your team talk about tests clearly, and do not let it become a constraint on writing useful ones.
