Mocking Everything in Your Tests Is a Sign Something Is Wrong
by Arif Ikhsanudin, Backend Developer
The Test That Is Mostly Mocks
You have seen this test. Probably written it.
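import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.util.Optional;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.InjectMocks;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;

@ExtendWith(MockitoExtension.class)
class OrderServiceTest {
    @Mock private OrderRepository orderRepository;
    @Mock private PaymentGateway paymentGateway;
    @Mock private InventoryService inventoryService;
    @Mock private NotificationService notificationService;
    @Mock private AuditLogger auditLogger;
    @Mock private PricingEngine pricingEngine;
    @Mock private FraudDetectionService fraudDetection;
    @InjectMocks private OrderService orderService;

    @Test
    void processOrder_success() {
        // Arrange: stub every collaborator on the happy path
        Order order = new Order(/* ... */);
        when(orderRepository.findById(1L)).thenReturn(Optional.of(order));
        when(pricingEngine.calculate(order)).thenReturn(new PricingResult(150.00));
        when(fraudDetection.check(order)).thenReturn(FraudResult.CLEAR);
        when(inventoryService.reserve(order)).thenReturn(true);
        when(paymentGateway.charge(order, 150.00)).thenReturn(PaymentResult.SUCCESS);

        // Act
        orderService.processOrder(1L);

        // Assert: only interactions are verified, no resulting state is checked
        verify(auditLogger).log(eq(1L), eq("ORDER_PROCESSED"), any());
        verify(notificationService).sendConfirmation(order);
    }
}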
This test has seven mocks and two verifications, and it asserts nothing about any resulting state. If you delete orderService.processOrder(1L) from the test body, the stubs configure fine and the test fails only because the verifications are not satisfied. The test is not checking that the order-processing logic is correct. It is checking that OrderService calls its dependencies with specific arguments. (It does not even pin down the order of those calls, since nothing here uses Mockito's InOrder.)
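To make the weakness concrete, here is a minimal Python sketch using the standard library's `unittest.mock`. `BuggyOrderService` and its two collaborators are hypothetical stand-ins, not the `OrderService` above. The service forgets to mark the order as paid, yet a test that only verifies which calls happened still passes.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass
class Order:
    id: int
    status: str = "NEW"


class BuggyOrderService:
    """Hypothetical service with a bug: it never marks the order as paid."""

    def __init__(self, payment_gateway, notification_service):
        self.payment_gateway = payment_gateway
        self.notification_service = notification_service

    def process_order(self, order: Order) -> None:
        self.payment_gateway.charge(order, 150.00)
        # Bug: order.status should become "PAID" here, but is never updated.
        self.notification_service.send_confirmation(order)


def run_interaction_only_test() -> Order:
    payment_gateway = Mock()
    notification_service = Mock()
    order = Order(id=1)

    BuggyOrderService(payment_gateway, notification_service).process_order(order)

    # Both interaction checks pass even though the outcome is wrong.
    payment_gateway.charge.assert_called_once_with(order, 150.00)
    notification_service.send_confirmation.assert_called_once_with(order)
    return order


order = run_interaction_only_test()
print(order.status)  # prints "NEW": the bug went unnoticed
```

A single state assertion such as `assert order.status == "PAID"` would have caught the bug that both call verifications missed.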
What This Test Is Actually Telling You
A test that requires seven mocks is telling you that the class under test has seven dependencies. That is usually too many. The Single Responsibility Principle says a class should have one reason to change. Every dependency is a potential reason to change, so a class with seven of them is very likely doing several jobs at once.
The testing pain is the feedback. The class needs to be decomposed.
When you decompose the class, each smaller piece has fewer dependencies and needs fewer mocks to test, and the resulting tests verify actual logic rather than wiring.
// Before decomposition: OrderService does everything
// After decomposition: separate concerns

class OrderValidator {
    // Depends only on FraudDetectionService
    ValidationResult validate(Order order) { ... }
}

class OrderPricer {
    // Depends only on PricingEngine
    PricingResult price(Order order) { ... }
}

class OrderFulfillmentService {
    // Depends on InventoryService, PaymentGateway
    FulfillmentResult fulfill(Order order, PricingResult price) { ... }
}

class OrderService {
    // Orchestrates, depends on the above + repo + notification + audit
    void processOrder(long orderId) { ... }
}
Now each extracted class can be tested with one or two mocks, and the tests verify actual logic. OrderValidator tests can verify that a flagged fraud check produces a validation failure. OrderPricer tests can verify that discounts are applied correctly. The wiring (that OrderService calls these in the right order) is tested once, at the OrderService level, with lighter mocking because the logic has moved out.
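Here is roughly what such a focused test can look like, sketched in Python. Everything in it is an assumption made for illustration: the shape of `ValidationResult`, the `FLAGGED` enum member (the Java test above only shows `CLEAR`), and the hand-written `StubFraudDetection`. The point is that the test asserts on a returned value instead of verifying calls.

```python
from dataclasses import dataclass
from enum import Enum


class FraudResult(Enum):
    CLEAR = "clear"
    FLAGGED = "flagged"  # assumed member, not shown in the original test


@dataclass(frozen=True)
class Order:
    id: int


@dataclass(frozen=True)
class ValidationResult:
    valid: bool
    reason: str = ""


class StubFraudDetection:
    """Hand-written stub that returns the same fixed answer for every order."""

    def __init__(self, result: FraudResult):
        self._result = result

    def check(self, order: Order) -> FraudResult:
        return self._result


class OrderValidator:
    """Depends only on a fraud detection collaborator."""

    def __init__(self, fraud_detection):
        self._fraud_detection = fraud_detection

    def validate(self, order: Order) -> ValidationResult:
        if self._fraud_detection.check(order) is FraudResult.FLAGGED:
            return ValidationResult(valid=False, reason="fraud check flagged the order")
        return ValidationResult(valid=True)


def test_flagged_order_fails_validation():
    validator = OrderValidator(StubFraudDetection(FraudResult.FLAGGED))
    result = validator.validate(Order(id=1))
    assert result.valid is False
    assert "fraud" in result.reason


def test_clear_order_passes_validation():
    validator = OrderValidator(StubFraudDetection(FraudResult.CLEAR))
    assert validator.validate(Order(id=1)).valid is True


test_flagged_order_fails_validation()
test_clear_order_passes_validation()
```

With one dependency and one decision to make, the test needs a single stub and no call verification at all.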
When Mocking Is the Right Choice
Mocking is appropriate for dependencies that are expensive, non-deterministic, or have real external effects:
- Infrastructure dependencies (databases, HTTP clients, message queues): always mock or use a fake in unit tests
- Services with side effects (sending emails, processing payments): mock in unit tests, test the real behavior in integration tests
- Non-deterministic sources (current time, random numbers): inject a controllable interface and use a deterministic implementation in tests
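For the last item in this list, a minimal Python sketch of an injected time source might look like the following. `OrderExpiryPolicy` and its 30-minute rule are invented for illustration. The class receives a `now` callable instead of reading the system clock directly, so a test can pin time to a fixed instant.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


class OrderExpiryPolicy:
    """Hypothetical rule: an unpaid order expires 30 minutes after creation."""

    def __init__(self, now: Callable[[], datetime]):
        # The clock is injected rather than read from the system.
        self._now = now

    def is_expired(self, created_at: datetime) -> bool:
        return self._now() - created_at > timedelta(minutes=30)


# Production wiring would pass the real clock:
#   OrderExpiryPolicy(now=lambda: datetime.now(timezone.utc))

# Test wiring passes a deterministic one.
fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
policy = OrderExpiryPolicy(now=lambda: fixed_now)

assert policy.is_expired(fixed_now - timedelta(minutes=31))
assert not policy.is_expired(fixed_now - timedelta(minutes=29))
```

No mocking library is involved. The test controls time by passing a different function.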
Mocking is a smell when used for:
- Your own domain objects that are fast and pure
- Simple collaborators you own that have no side effects
- Classes whose interaction with the class under test is itself the thing worth testing
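As a sketch of the first two cases, consider a pure value object. The `Cart`, `LineItem`, and `qualifies_for_free_shipping` names below are hypothetical. Building the real objects takes one line, runs in microseconds, and exercises the real arithmetic, which a mocked `total()` would bypass.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Tuple


@dataclass(frozen=True)
class LineItem:
    unit_price: Decimal
    quantity: int

    def subtotal(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass(frozen=True)
class Cart:
    items: Tuple[LineItem, ...]

    def total(self) -> Decimal:
        return sum((item.subtotal() for item in self.items), Decimal("0"))


def qualifies_for_free_shipping(cart: Cart, threshold: Decimal = Decimal("100")) -> bool:
    """Hypothetical rule under test: free shipping at or above the threshold."""
    return cart.total() >= threshold


# Use the real domain objects: no stubbing of total() or subtotal().
cart = Cart(items=(LineItem(Decimal("40.00"), 2), LineItem(Decimal("25.00"), 1)))

assert cart.total() == Decimal("105.00")
assert qualifies_for_free_shipping(cart)
```

Had `Cart` been mocked with a stubbed `total()`, a bug in `subtotal()` or in the summation would never surface in this test.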
The Fake as an Alternative
For collaborators you own and control, a hand-written fake is often better than a mock. A fake is a simplified but real implementation that behaves correctly without I/O:
class InMemoryOrderRepository:
    def __init__(self):
        self._store = {}

    def save(self, order: Order) -> Order:
        self._store[order.id] = order
        return order

    def find_by_id(self, order_id: int) -> Order | None:
        return self._store.get(order_id)
This fake is used in tests instead of a mock. It does not require when(...) declarations. It behaves like a real repository, just in memory. Tests using it are testing actual behavior rather than mock interaction, and they survive refactors that change internal call patterns.
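Here is how a test built on that fake might read. The fake is repeated so the snippet runs on its own. The `Order` dataclass, the status strings, and the reduced `OrderService.cancel` method are assumptions made for this sketch, not part of the service discussed earlier.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Order:
    id: int
    status: str = "NEW"


class InMemoryOrderRepository:
    """The fake from above, repeated so this snippet is self-contained."""

    def __init__(self):
        self._store: Dict[int, Order] = {}

    def save(self, order: Order) -> Order:
        self._store[order.id] = order
        return order

    def find_by_id(self, order_id: int) -> Optional[Order]:
        return self._store.get(order_id)


class OrderService:
    """Hypothetical, reduced service that only knows how to cancel orders."""

    def __init__(self, repository):
        self._repository = repository

    def cancel(self, order_id: int) -> bool:
        order = self._repository.find_by_id(order_id)
        if order is None or order.status == "SHIPPED":
            return False
        order.status = "CANCELLED"
        self._repository.save(order)
        return True


def test_cancel_marks_order_cancelled():
    repository = InMemoryOrderRepository()
    repository.save(Order(id=1))

    assert OrderService(repository).cancel(1) is True
    # Assert on the stored state, not on which repository methods were called.
    assert repository.find_by_id(1).status == "CANCELLED"


def test_shipped_order_cannot_be_cancelled():
    repository = InMemoryOrderRepository()
    repository.save(Order(id=2, status="SHIPPED"))

    assert OrderService(repository).cancel(2) is False
    assert repository.find_by_id(2).status == "SHIPPED"


test_cancel_marks_order_cancelled()
test_shipped_order_cannot_be_cancelled()
```

If `cancel` later loads the order twice or saves it through a different code path, these tests keep passing as long as the stored order ends up in the right state.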
The goal is tests that verify your code does the right thing, not tests that verify your code calls the right things. When mocks are verifying call sequences more than they are enabling behavioral assertions, the design — not the tests — needs to change.