The Line Between Unit Tests and Integration Tests Is Blurrier Than You Think
by Eric Hanson, Backend Developer at Clean Systems Consulting
The Definitional Argument That Goes Nowhere
Every team eventually has this argument: "Is this test a unit test or an integration test?" One developer says a unit test must test a single class in isolation. Another says any test that does not touch external infrastructure is a unit test. A third says the distinction is about speed, not scope.
All three are describing real distinctions that matter in practice. None of them is the canonical definition. The term "unit test" does not have an authoritative specification — it has a community consensus that is narrower in academic literature and broader in engineering practice, with significant variation within both.
Spending time on this argument is almost always less useful than asking the question that actually matters: does this test give me fast, reliable, actionable feedback?
The Properties That Actually Matter
Instead of categorical membership, evaluate tests on properties:
Speed. Can it run in under 100 milliseconds? Can the full suite run in under 30 seconds? Slow tests get run less often, which means slower feedback, which means the test is less valuable.
Determinism. Does it produce the same result on every run, on every machine, regardless of external state? A flaky test that occasionally fails is worse than no test in many ways — it trains developers to ignore red builds.
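The most common determinism failure is hidden dependence on ambient state, the wall clock being the classic case. A minimal sketch in plain Node (the function names and the session shape are illustrative, not from this article) showing the flaky pattern and the deterministic fix of passing the clock in as a parameter:

```javascript
// Hypothetical session-expiry check; names are illustrative.
// Flaky version: reads the real clock, so the result depends on
// exactly when the test happens to run.
function isExpiredFlaky(session) {
  return session.expiresAt < Date.now();
}

// Deterministic version: "now" is a parameter, so the test controls it.
function isExpired(session, now) {
  return session.expiresAt < now;
}

// The test pins time to a fixed instant and passes identically
// on every run, on every machine.
const session = { expiresAt: 1000 };
console.log(isExpired(session, 999));  // false: not yet expired
console.log(isExpired(session, 1001)); // true: past expiry
```

The same move works for randomness, environment variables, and locale: anything the test cannot control should be injected rather than read from the environment.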
Isolation. Does a failure in this test tell you specifically what broke, or does it tell you "something in this general area is wrong"? A test that exercises fifteen classes to make one assertion will point you at a general region of the codebase, not a specific behavior.
Maintenance cost. Does the test break when the implementation is refactored without a behavioral change? High-maintenance tests slow down development more than they help.
Where the Boundary Gets Genuinely Blurry
Consider a test that exercises a service class with two or three real collaborators but no I/O. No database, no network, no filesystem. All in memory. Is this a unit test?
By the "single class in isolation" definition: no. By the "no external infrastructure" definition: yes. By the "fast and deterministic" criteria: yes, almost certainly.
For practical purposes, this test is fine. It is fast, deterministic, and focused. Whether it is called a unit test or an integration test has no operational significance. Put it in the "fast" suite. Run it constantly. Use it.
// Is this a unit test or an integration test?
// It uses three real collaborators, no mocks.
// But it is pure in-memory, sub-millisecond, deterministic.
describe('OrderPriceCalculator', () => {
  const taxService = new TaxService(); // Real, no I/O
  const discountService = new DiscountService(); // Real, no I/O
  const calculator = new OrderPriceCalculator(taxService, discountService);

  it('applies discount before tax', () => {
    const order = { subtotal: 100, discountCode: 'SAVE10', region: 'US_CA' };
    const result = calculator.calculate(order);
    // $100 - 10% = $90, then + 8.5% CA tax = $97.65
    expect(result.total).toBe(97.65);
  });
});
The categorical answer to "unit or integration?" is genuinely unclear. The operational answer is: put it in the fast suite, run it constantly, it is doing useful work.
When the Distinction Does Matter
The distinction between unit and integration becomes meaningful at the infrastructure boundary. A test that starts a Docker container is fundamentally different from one that does not — in startup time, in environmental requirements, in flakiness profile. These tests cannot run on every file save without slowing the feedback loop to unusability.
The practical split that matters:
- Runs anywhere, fast (under 100ms, no Docker, no network): Run on every save, every commit, pre-push
- Requires infrastructure (database, broker, mock HTTP server): Run on pull request, pre-deploy
That is the operationally significant boundary. Call each category whatever you want.
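The two-category split can be encoded directly in the test runner so the taxonomy argument never needs to happen. A minimal sketch assuming Jest and a file-naming convention (`*.test.js` for fast tests, `*.infra.test.js` for infrastructure tests — the convention is an assumption, not from this article):

```javascript
// jest.config.js -- a sketch, assuming Jest's "projects" feature.
module.exports = {
  projects: [
    {
      displayName: 'fast', // run on every save, every commit, pre-push
      testMatch: ['**/*.test.js'],
      testPathIgnorePatterns: ['\\.infra\\.test\\.js$'],
    },
    {
      displayName: 'infra', // run on pull request, pre-deploy
      testMatch: ['**/*.infra.test.js'],
    },
  ],
};
// Inner loop:  jest --selectProjects fast
// CI pipeline: jest --selectProjects infra
```

The names "fast" and "infra" are deliberately operational: they describe when the tests run, not which taxonomy bucket they belong to.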
The teams that spend the most time arguing about definitions tend to have suites that are optimized for neither speed nor reliability. The teams that spend zero time on the argument and focus on "does this test run fast and catch real bugs?" tend to have better suites. The taxonomy is a communication tool, not an optimization target. Use the one that helps your team talk about tests clearly, and do not let it become a constraint on writing useful ones.