Java Thread Management — Why ExecutorService Exists and How to Use It Well
by Eric Hanson, Backend Developer at Clean Systems Consulting
The cost of raw thread creation
A Java thread maps to an OS thread. Creating one allocates a stack (512KB–1MB by default), registers the thread with the OS scheduler, and involves several system calls. On a modern server, creating and destroying a thread takes roughly 1–10ms — negligible for a one-time operation, significant for per-request work.
More importantly, threads are not free at rest. A thread pool of 200 threads holds 100–200MB of stack memory regardless of whether those threads are doing anything. An application that creates threads unboundedly under load will exhaust OS thread limits (ulimit -u) or memory before it exhausts CPU.
ExecutorService decouples task submission from thread lifecycle. Threads are created once (or on demand up to a configured limit), reused across tasks, and terminated on shutdown. The application submits work; the executor manages when and on which thread it runs.
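That decoupling looks like this in its simplest form (a minimal sketch; class and method names here are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitDemo {
    // Submit a task to a reusable pool and wait for its result.
    // The four threads are created once and reused across all submitted tasks.
    static int runOnPool() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Future<Integer> sum = pool.submit(() -> 2 + 2); // a task, not a thread
            return sum.get(); // blocks until the result is ready
        } finally {
            pool.shutdown(); // threads terminate once queued work drains
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runOnPool()); // prints 4
    }
}
```

The caller never touches a Thread object; it hands over work and gets back a handle to the result.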
ThreadPoolExecutor — the knobs that matter
Executors.newFixedThreadPool(n) and its siblings are convenience wrappers around ThreadPoolExecutor. Understanding ThreadPoolExecutor directly gives you control over the parameters the convenience methods don't expose:
ThreadPoolExecutor executor = new ThreadPoolExecutor(
    4,                               // corePoolSize
    8,                               // maximumPoolSize
    60L, TimeUnit.SECONDS,           // keepAliveTime for idle threads above core
    new LinkedBlockingQueue<>(1000), // work queue with bounded capacity
    new ThreadFactory() {            // named threads for debugging
        private final AtomicInteger count = new AtomicInteger(0);
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, "order-processor-" + count.incrementAndGet());
            t.setDaemon(false);
            return t;
        }
    },
    new ThreadPoolExecutor.CallerRunsPolicy() // rejection policy
);
corePoolSize — threads kept alive even when idle. Until corePoolSize threads exist, each submitted task starts a new thread, even if other core threads are idle; once the core is full, tasks queue.
maximumPoolSize — upper bound on thread count. New threads above corePoolSize are created only when the queue is full. With an unbounded queue (LinkedBlockingQueue with no capacity argument), maximumPoolSize is irrelevant — the queue absorbs all tasks and extra threads are never created.
Work queue capacity — the bounded queue (new LinkedBlockingQueue<>(1000)) is critical. Executors.newFixedThreadPool uses an unbounded queue. Under sustained overload, an unbounded queue grows indefinitely — tasks accumulate, memory grows, latency climbs, and the application eventually OOMs. A bounded queue with a rejection policy is the correct production configuration.
Rejection policy — what happens when the queue is full and all threads are busy:
- AbortPolicy (default) — throws RejectedExecutionException. The caller must handle it.
- CallerRunsPolicy — the submitting thread executes the task itself. This slows submission naturally, creating back-pressure. Usually the right choice for CPU-bound work where the caller can afford to wait.
- DiscardPolicy — silently drops the task. Appropriate only when tasks are truly optional.
- DiscardOldestPolicy — drops the oldest queued task to make room. Appropriate for time-sensitive work where recency matters.
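The default AbortPolicy is easy to see with a deliberately tiny pool. A sketch, with illustrative names, showing the exception surfacing at the submission site:

```java
import java.util.concurrent.*;

public class RejectionDemo {
    static boolean overflowIsRejected() {
        // One thread, a one-slot queue, and the default AbortPolicy.
        ThreadPoolExecutor tiny = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch hold = new CountDownLatch(1);
        tiny.submit(() -> await(hold)); // occupies the single thread
        tiny.submit(() -> {});          // fills the one queue slot
        boolean rejected = false;
        try {
            tiny.submit(() -> {});      // no thread, no slot: AbortPolicy throws
        } catch (RejectedExecutionException e) {
            rejected = true;            // caller decides: retry, shed load, or fail
        }
        hold.countDown();
        tiny.shutdown();
        return rejected;
    }

    static void await(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(overflowIsRejected()); // prints true
    }
}
```

Swapping in CallerRunsPolicy as the sixth constructor argument would make the third submit run on the caller's own thread instead of throwing.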
Monitoring pool health
A thread pool that's silently overwhelmed or permanently underutilized is a common production problem. Expose the metrics:
// Via Micrometer — registers gauges automatically
new ExecutorServiceMetrics(executor, "order-processor", Collections.emptyList())
        .bindTo(Metrics.globalRegistry);
The metrics worth watching:
- executor.pool.size — current thread count
- executor.queued — tasks in the work queue; the most important signal
- executor.active — threads currently executing tasks
- executor.completed — total completed task count (for throughput calculation)
A queue depth that grows steadily under normal load means the pool is undersized or tasks are taking longer than expected. A queue depth of zero with many idle threads means the pool is oversized. Neither situation is obvious without instrumentation.
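When Micrometer isn't available, the same signals can be polled directly from ThreadPoolExecutor's own getters. A minimal sketch (names are illustrative):

```java
import java.util.concurrent.*;

public class PoolStats {
    // The same signals the Micrometer binding exposes, read directly.
    static String snapshot(ThreadPoolExecutor e) {
        return String.format("size=%d active=%d queued=%d completed=%d",
                e.getPoolSize(), e.getActiveCount(),
                e.getQueue().size(), e.getCompletedTaskCount());
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(100));
        for (int i = 0; i < 5; i++) pool.submit(() -> {});
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(snapshot(pool)); // after drain: active=0, queued=0, completed=5
    }
}
```

A scheduled task logging this snapshot every minute is a crude but serviceable substitute for a metrics registry.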
Callable, Future, and CompletableFuture
ExecutorService.submit(Callable<T>) returns a Future<T> — a handle to a result that may not be available yet:
Future<ProcessingResult> future = executor.submit(() -> processOrder(order));
// ... do other work ...
try {
    ProcessingResult result = future.get(5, TimeUnit.SECONDS); // blocks, with timeout
} catch (TimeoutException e) {
    future.cancel(true); // interrupt the task if it's still running
    throw new OrderProcessingTimeoutException(order.id());
}
future.get() without a timeout blocks indefinitely — a deadlock waiting to happen if the task hangs. Always use get(timeout, unit) for tasks that interact with external systems.
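For a batch of tasks that share one deadline, invokeAll applies the timeout to the whole batch and cancels whatever hasn't finished. A sketch with placeholder task bodies:

```java
import java.util.List;
import java.util.concurrent.*;

public class BatchDemo {
    // Run a batch of callables under one collective deadline.
    static List<Future<String>> runBatch() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<String>> tasks = List.of(
                () -> "fast",
                () -> { Thread.sleep(10_000); return "slow"; });
        // invokeAll blocks until everything finishes or the deadline passes,
        // then cancels any task that is still running or queued.
        List<Future<String>> results = pool.invokeAll(tasks, 1, TimeUnit.SECONDS);
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<Future<String>> results = runBatch();
        System.out.println(results.get(0).get());         // "fast"
        System.out.println(results.get(1).isCancelled()); // true: missed the deadline
    }
}
```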
CompletableFuture is the higher-level abstraction for composing asynchronous operations. It's more expressive than Future for pipelines:
CompletableFuture<Order> result = CompletableFuture
        .supplyAsync(() -> validateOrder(order), validationExecutor)
        .thenApplyAsync(validated -> chargePayment(validated), paymentExecutor)
        .thenApplyAsync(charged -> fulfillOrder(charged), fulfillmentExecutor)
        .exceptionally(ex -> handleFailure(order, ex));
Each stage runs on the specified executor (or ForkJoinPool.commonPool() if no executor is provided). thenApplyAsync vs thenApply: the async variant runs the function on the given executor; the non-async variant runs it on whichever thread completed the previous stage — possibly a thread from a different pool, or the calling thread if that stage had already completed.
The ForkJoinPool.commonPool() trap. When no executor is specified, CompletableFuture uses the common pool. The common pool is shared across the entire JVM — any library or framework using CompletableFuture without a custom executor competes for the same threads. Blocking operations in the common pool starve other users. Always provide a named executor for production CompletableFuture pipelines.
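A minimal sketch of that advice: independent stages on a named executor, joined with thenCombine (pool name, method names, and values are illustrative):

```java
import java.util.concurrent.*;

public class CombineDemo {
    // Two independent stages on the supplied pool, joined once both complete.
    static int totalPrice(ExecutorService ioPool) throws Exception {
        CompletableFuture<Integer> price = CompletableFuture
                .supplyAsync(() -> 100, ioPool); // placeholder for a remote call
        CompletableFuture<Integer> tax = CompletableFuture
                .supplyAsync(() -> 8, ioPool);   // placeholder for a second call
        return price.thenCombine(tax, Integer::sum).get();
    }

    public static void main(String[] args) throws Exception {
        // A named, bounded pool instead of the JVM-wide common pool.
        // (Production code would number the threads, as shown earlier.)
        ExecutorService ioPool = Executors.newFixedThreadPool(
                4, r -> new Thread(r, "io-pool"));
        System.out.println(totalPrice(ioPool)); // prints 108
        ioPool.shutdown();
    }
}
```

Because the executor is explicit, a blocked stage can only starve this pool, not every other common-pool user in the JVM.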
ScheduledExecutorService for periodic work
ScheduledExecutorService replaces Timer for scheduled and periodic tasks. Timer is single-threaded, and an uncaught exception in one TimerTask kills the timer thread, silently cancelling every other scheduled task. ScheduledExecutorService uses a thread pool and propagates exceptions through the Future:
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2,
        r -> new Thread(r, "scheduler-" + UUID.randomUUID()));
// One-shot delay
scheduler.schedule(() -> sendReminderEmail(userId), 24, TimeUnit.HOURS);
// Fixed rate — fires every 5 minutes regardless of task duration
scheduler.scheduleAtFixedRate(
    () -> refreshCache(),
    0, // initial delay
    5, TimeUnit.MINUTES
);
// Fixed delay — waits 5 minutes after each completion before next execution
scheduler.scheduleWithFixedDelay(
    () -> processQueue(),
    0,
    5, TimeUnit.MINUTES
);
scheduleAtFixedRate fires on a wall-clock schedule — executions never overlap, but if the task takes longer than the interval, subsequent runs start late and fire back-to-back, so the schedule falls behind. scheduleWithFixedDelay waits for the previous execution to complete before scheduling the next — better for tasks where back-to-back runs would cause problems.
Exception handling matters: if the scheduled task throws an unchecked exception, the ScheduledFuture captures it and future executions are cancelled. Wrap the task body in try-catch if the task must continue despite errors:
scheduler.scheduleAtFixedRate(() -> {
    try {
        refreshCache();
    } catch (Exception e) {
        log.error("Cache refresh failed", e);
        // does NOT cancel future executions
    }
}, 0, 5, TimeUnit.MINUTES);
Virtual threads — where the model changes
Java 21 introduced virtual threads as a production feature. Virtual threads are lightweight — millions can exist simultaneously, each using a few hundred bytes rather than a megabyte stack. They're scheduled by the JVM onto a small pool of OS carrier threads rather than mapping one-to-one to OS threads.
The implication for I/O-bound applications: the thread-per-request model becomes viable again. A virtual thread that blocks on I/O doesn't hold an OS thread — it's parked by the JVM and the carrier thread serves another virtual thread. This eliminates the need for reactive/async programming styles for I/O concurrency:
// With virtual threads — simple blocking code scales like async
try (ExecutorService vExecutor = Executors.newVirtualThreadPerTaskExecutor()) {
vExecutor.submit(() -> {
String result = httpClient.get(url); // blocks the virtual thread, not an OS thread
processResult(result);
});
}
When virtual threads don't help:
- CPU-bound work. Virtual threads don't add CPU capacity — they help when threads would otherwise block. A computation-heavy task on a virtual thread still occupies a carrier thread.
- Synchronized blocks that pin. A virtual thread that enters a synchronized block pins its carrier thread for the duration — the carrier is unavailable for other virtual threads. Replace synchronized with ReentrantLock in code that will run on virtual threads.
- Connection pools. Database connections are still finite. A million virtual threads all hitting a connection pool with 20 connections still queue on those 20 connections. Virtual threads don't eliminate resource constraints, only thread stack overhead.
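The synchronized-to-ReentrantLock replacement can be sketched like this (a hypothetical cache; loadRemote stands in for a real blocking call):

```java
import java.util.concurrent.locks.ReentrantLock;

public class Cache {
    private final ReentrantLock lock = new ReentrantLock();
    private String cached;

    // On Java 21, blocking inside synchronized pins the carrier thread;
    // blocking on a ReentrantLock parks the virtual thread and frees the carrier.
    public String get() {
        lock.lock();
        try {
            if (cached == null) {
                cached = loadRemote(); // may block on I/O while holding the lock
            }
            return cached;
        } finally {
            lock.unlock();
        }
    }

    private String loadRemote() { return "value"; } // placeholder for a real fetch
}
```

The lock/try/finally shape is the standard ReentrantLock idiom: the unlock must sit in a finally block so an exception in the critical section can't leak the lock.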
For new I/O-bound services on Java 21+, Executors.newVirtualThreadPerTaskExecutor() is often the right default. For existing services, the migration path is incremental — replace thread pool executors with virtual thread executors for I/O-bound work, verify that synchronized blocks don't cause pinning, and keep CPU-bound work on platform thread pools.
Shutdown — the part that gets skipped
ExecutorService must be shut down explicitly. Without it, the JVM may not exit because non-daemon threads keep it alive:
executor.shutdown(); // stop accepting new tasks
try {
    if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
        executor.shutdownNow(); // interrupt running tasks
        if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
            log.error("Executor did not terminate");
        }
    }
} catch (InterruptedException e) {
    executor.shutdownNow();
    Thread.currentThread().interrupt();
}
shutdown() initiates orderly shutdown — queued tasks complete, no new tasks accepted. shutdownNow() attempts to interrupt running tasks by setting the interrupt flag. Tasks that don't check Thread.interrupted() or block on interruptible operations won't stop promptly.
In Spring applications, beans that hold executors should implement DisposableBean or use @PreDestroy to trigger shutdown on application context close. Executors that outlive the application context hold threads that prevent clean shutdown and may process tasks against partially torn-down infrastructure.
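One way to wire that up is to give the executor-holding bean an explicit close method, which a Spring application would annotate with @PreDestroy (a sketch; the class and pool names are hypothetical):

```java
import java.util.concurrent.*;

// In a Spring application, close() would carry @PreDestroy (or the class would
// implement DisposableBean) so the context invokes it on shutdown.
public class OrderProcessingPool implements AutoCloseable {
    private final ExecutorService executor =
            Executors.newFixedThreadPool(4, r -> new Thread(r, "order-pool"));

    public Future<?> submit(Runnable task) {
        return executor.submit(task);
    }

    @Override
    public void close() throws InterruptedException {
        executor.shutdown(); // stop accepting work; let queued tasks drain
        if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
            executor.shutdownNow(); // then interrupt whatever is still running
        }
    }
}
```

Implementing AutoCloseable also makes the pool usable in tests via try-with-resources, independent of any container lifecycle.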