Java Memory Leaks in Practice — How They Form and How to Find Them

by Eric Hanson, Backend Developer at Clean Systems Consulting

What a Java memory leak actually is

In C, a memory leak means allocating memory and losing the pointer to it — the memory can never be freed. Java's GC prevents that specific failure. A Java memory leak is different: an object is reachable — there is a reference chain from a GC root to the object — but the object is no longer needed by the application. The GC cannot collect it because it's reachable. The application doesn't use it because the logic no longer needs it. The object sits in the heap indefinitely, accumulating alongside every other such object until OutOfMemoryError.

The practical definition: a Java memory leak is an unintentional reference that prevents GC from collecting objects the application considers logically dead.

The patterns that cause them

Static collections that grow without bound

A static field is a GC root: anything reachable from it lives as long as the class stays loaded, which for application code means the process lifetime. A static collection that accumulates entries without a removal policy is a leak:

public class RequestTracker {
    // Every Request added here lives until the process dies
    private static final List<Request> completedRequests = new ArrayList<>();

    public static void track(Request request) {
        completedRequests.add(request);
    }
}

The fix depends on intent. If you need a bounded history: a Deque with an explicitly enforced size cap (ArrayDeque has no built-in bound). If you need a time-bounded cache: Caffeine.newBuilder().expireAfterWrite(1, TimeUnit.HOURS).build(). If you don't actually need it: remove it.
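A minimal sketch of the bounded-history option, enforcing the cap by hand (the String ids, class name, and cap value are illustrative, not from the original):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BoundedRequestTracker {
    private static final int MAX_HISTORY = 1_000;
    // Bounded: once the cap is reached, the oldest entry is evicted
    private static final Deque<String> completedRequests = new ArrayDeque<>(MAX_HISTORY);

    public static synchronized void track(String requestId) {
        if (completedRequests.size() == MAX_HISTORY) {
            completedRequests.removeFirst(); // evict oldest
        }
        completedRequests.addLast(requestId);
    }

    public static synchronized int size() {
        return completedRequests.size();
    }
}
```

The size check runs on every add, so memory is bounded regardless of how long the process lives.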

Listeners and callbacks not removed

Event-driven systems register listeners. If the listener holds a reference to a large object and is never deregistered, both the listener and everything it references accumulate:

public class Dashboard {
    private final DataService service;

    public Dashboard(DataService service) {
        this.service = service;
        // Registers itself — service now holds a reference to this Dashboard
        service.addUpdateListener(this::onDataUpdate);
    }

    private void onDataUpdate(DataEvent event) { /* ... */ }

    // When this Dashboard is "closed", the listener is never removed.
    // service still holds a reference — Dashboard and everything it references cannot be GC'd.
}

The fix: implement AutoCloseable and remove the listener in close(). Or use weak references in the listener registry (covered below).
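A sketch of the close() approach; DataService, DataEvent, and the listener API here are assumed shapes standing in for whatever framework you use:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

class DataEvent {}

class DataService {
    private final List<Consumer<DataEvent>> listeners = new CopyOnWriteArrayList<>();
    void addUpdateListener(Consumer<DataEvent> listener) { listeners.add(listener); }
    void removeUpdateListener(Consumer<DataEvent> listener) { listeners.remove(listener); }
    int listenerCount() { return listeners.size(); }
}

public class Dashboard implements AutoCloseable {
    private final DataService service;
    // Store the exact reference: this::onDataUpdate creates a new object each
    // time it is evaluated, so remove() must be given the same instance
    private final Consumer<DataEvent> listener = this::onDataUpdate;

    public Dashboard(DataService service) {
        this.service = service;
        service.addUpdateListener(listener);
    }

    private void onDataUpdate(DataEvent event) { /* react to updates */ }

    @Override
    public void close() {
        service.removeUpdateListener(listener);
    }
}
```

Using try-with-resources then guarantees deregistration even on exceptions. Note the field holding the method reference: passing this::onDataUpdate to both add and remove would pass two different objects, and the removal would silently fail.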

ThreadLocal variables not removed

ThreadLocal stores values per-thread. In application servers and thread pools, threads are reused — a ThreadLocal set during one request survives to the next request on the same thread if it's never removed. In server environments with long-lived thread pools, this is a reliable source of leaks:

public class RequestContext {
    private static final ThreadLocal<UserSession> SESSION = new ThreadLocal<>();

    public static void setSession(UserSession session) {
        SESSION.set(session);
    }

    public static UserSession getSession() {
        return SESSION.get();
    }

    // Missing: SESSION.remove() after the request completes
}

Every UserSession set during a request stays attached to the thread after the request completes. Each pooled thread retains the session from the most recent request it processed, so the waste is bounded by pool size rather than growing without limit — but stale data from one request appearing in the next is both a leak and a correctness bug.
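The reuse behavior is easy to demonstrate with a single-thread pool — a minimal sketch, not the servlet code above:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class ThreadLocalLeakDemo {
    private static final ThreadLocal<String> SESSION = new ThreadLocal<>();

    public static String demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicReference<String> seenBySecondTask = new AtomicReference<>();

        // "Request 1" sets the ThreadLocal and never calls remove()
        pool.submit(() -> SESSION.set("user-alice"));
        // "Request 2" runs on the same pooled thread and sees the stale value
        pool.submit(() -> seenBySecondTask.set(SESSION.get()));

        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seenBySecondTask.get();
    }
}
```

The second task never set the session, yet it observes the first task's value because both ran on the same reused thread.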

The fix: SESSION.remove() in a servlet filter's finally block or a Spring interceptor's afterCompletion:

try {
    SESSION.set(buildSession(request));
    chain.doFilter(request, response);
} finally {
    SESSION.remove(); // always, even on exception
}

Caches without eviction

Any in-memory cache that grows without bound is a managed leak. The most common form: a HashMap used as a cache with no removal policy:

public class UserCache {
    private final UserRepository userRepository;
    private final Map<Long, User> cache = new HashMap<>();

    public UserCache(UserRepository userRepository) {
        this.userRepository = userRepository;
    }

    public User get(long userId) {
        return cache.computeIfAbsent(userId, id -> userRepository.findById(id));
    }
    // No removal — every User ever loaded lives here forever
}

The fix: use a cache library with eviction semantics. Caffeine is the standard choice for Java in-process caches — it provides LRU/LFU eviction, time-based expiry, size bounds, and weak/soft reference support:

LoadingCache<Long, User> cache = Caffeine.newBuilder()
    .maximumSize(10_000)
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build(userId -> userRepository.findById(userId));
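If pulling in a library isn't an option, LinkedHashMap's removeEldestEntry hook gives a dependency-free LRU bound — a crude sketch with no time-based expiry:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Size-bounded LRU cache using LinkedHashMap's eviction hook.
// No time-based expiry or statistics — use a real cache library for those.
public class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true → iteration order is LRU
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every put; returning true evicts the eldest entry
        return size() > maxEntries;
    }
}
```

Note that plain LinkedHashMap is not thread-safe; wrap it with Collections.synchronizedMap or stick with Caffeine for concurrent access.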

Interned strings

String.intern() adds a string to the JVM's string pool, which since Java 7 lives on the heap rather than PermGen. Interning user-supplied strings — API keys, user-agent strings, arbitrary input — floods the string table with unique entries. Modern HotSpot can collect interned strings once they become unreachable, but a table bloated with millions of entries slows every subsequent intern() call, and any interned string the application still references stays pinned:

// Dangerous — interns user-supplied string into the permanent string pool
String normalized = userInput.intern();

Reserve intern() for a small, bounded set of known strings. For deduplicating a large but bounded set of strings, use a WeakHashMap<String, WeakReference<String>> or a plain map-based deduplicator — or simply don't intern: G1 can deduplicate the backing character arrays of equal strings in the background (-XX:+UseStringDeduplication, which requires G1).
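A map-based deduplicator is the simplest alternative to sketch — entries here live as long as the map does, so bound or clear it explicitly:

```java
import java.util.concurrent.ConcurrentHashMap;

// Returns one canonical instance per distinct string value, without
// touching the JVM string table. The caller owns the map's lifecycle.
public class StringDeduper {
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    public String dedup(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public void clear() {
        pool.clear(); // call when the working set changes, e.g. per batch
    }
}
```

Unlike intern(), this scopes deduplication to an object you control and can drop.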

Inner class references to outer class

Non-static inner classes hold an implicit reference to their enclosing outer class instance. If the inner class outlives the outer class — passed to an executor, registered as a listener, serialized — the outer class cannot be GC'd:

public class OrderProcessor {
    private final List<Order> pendingOrders; // large list

    public Runnable createProcessingTask() {
        // This anonymous Runnable holds an implicit reference to OrderProcessor
        return new Runnable() {
            @Override
            public void run() {
                // Even if we only use one thing from OrderProcessor,
                // the entire OrderProcessor is retained
                processPending();
            }
        };
    }
}

The fix: make the inner class static and pass only what it needs explicitly:

private static class ProcessingTask implements Runnable {
    private final List<Order> orders;

    ProcessingTask(List<Order> orders) {
        this.orders = orders;
    }

    @Override
    public void run() {
        processOrders(orders);
    }
}

Or use a lambda that captures only the specific field it needs rather than this. One trap: calling an instance method inside the lambda captures this implicitly, so the method the lambda calls must be static:

public Runnable createProcessingTask() {
    List<Order> orders = this.pendingOrders; // capture the field, not this
    return () -> processOrders(orders); // processOrders must be static
}

Weak and soft references — intentional temporary retention

WeakReference<T> and SoftReference<T> let you hold a reference that the GC can clear. Useful for caches and listener registries where you want the reference to not prevent collection:

// Weak reference — collected at next GC if no strong references exist
WeakReference<ExpensiveObject> ref = new WeakReference<>(new ExpensiveObject());
ExpensiveObject obj = ref.get(); // null if GC has collected it

// WeakHashMap — entries are removed when keys are GC'd
WeakHashMap<Widget, WidgetMetadata> widgetMeta = new WeakHashMap<>();

WeakHashMap is useful for associating metadata with objects you don't own — when the object is collected, the entry disappears automatically. It's not a general-purpose cache replacement — it doesn't evict based on size or time, only on GC pressure.
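The weak listener registry mentioned earlier can be sketched with a WeakHashMap keyed by listener (the registry shape is an assumption, not a real library):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Consumer;

// Listeners are held weakly: when the owning object becomes unreachable,
// its registry entry disappears at the next GC instead of leaking.
// Caveat: callers MUST keep a strong reference to a registered listener,
// or it can be collected (and silently stop firing) at any time.
public class WeakListenerRegistry<E> {
    private final Map<Consumer<E>, Boolean> listeners =
        Collections.synchronizedMap(new WeakHashMap<>());

    public void add(Consumer<E> listener) {
        listeners.put(listener, Boolean.TRUE);
    }

    public void fire(E event) {
        // Copy first: GC may clear entries while we iterate
        for (Consumer<E> listener : new ArrayList<>(listeners.keySet())) {
            listener.accept(event);
        }
    }

    public int size() {
        return listeners.size();
    }
}
```

The silent-unregistration caveat is why explicit removal in close() is usually the safer fix; weak registries fit best when listener lifetime genuinely tracks object lifetime.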

SoftReference is cleared only when the JVM is under memory pressure. This makes it suitable for memory-sensitive caches: keep entries while memory is available, release them before throwing OutOfMemoryError. In practice, Caffeine's soft value support is easier to use and better behaved than managing SoftReference directly.

Finding leaks with heap dumps

The diagnostic process for a suspected memory leak:

Step 1: Confirm the leak. Monitor heap usage over time with JMX, Micrometer, or your APM. A genuine leak shows steady growth that doesn't return to baseline after GC cycles. Heap that grows and then drops after major GC is high allocation rate, not a leak.

Step 2: Take a heap dump. At the point of suspected maximum leak, before the process OOMs:

jmap -dump:format=b,file=heap.hprof <pid>
# Or trigger from code:
# HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(...)
# bean.dumpHeap("heap.hprof", true);
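The programmatic version in full, using the HotSpotDiagnosticMXBean from the com.sun.management package (HotSpot-specific, available on standard OpenJDK builds):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    public static void dumpHeap(String path) {
        try {
            HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(path, true); // live = true: dump only reachable objects
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

dumpHeap fails if the target file already exists, so generate a fresh filename (timestamped paths work well).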

Configure the JVM to dump automatically on OOM:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/dumps/

Step 3: Analyze with Eclipse Memory Analyzer (MAT). MAT's "Leak Suspects" report identifies objects with unexpectedly high retained heap. The retained heap of an object is the total heap that would be freed if that object were collected — a useful proxy for "what is this object holding onto."

The key view: the dominator tree. An object A dominates object B if every path from GC roots to B passes through A. Objects high in the dominator tree with large retained heap are leak candidates.

Step 4: Trace the reference chain. MAT's "Path to GC Roots" shows the reference chain keeping a suspected leak object alive. Follow it to the static field, ThreadLocal, or long-lived collection that's holding the reference.

VisualVM (free, bundled with JDK) provides a lighter-weight heap analysis. For production systems where taking a heap dump is disruptive, async-profiler's heap profiling mode samples allocations without a full dump.

The operational check that catches leaks early

Heap dumps are for diagnosis. The early warning is a memory usage metric that grows between GC cycles:

# Micrometer metric names (Prometheus renders them as jvm_memory_used_bytes)
jvm.memory.used{area="heap"} — should return to a stable baseline after major GC
jvm.memory.used{area="nonheap"} — Metaspace, code cache, compressed class space; should plateau after application startup

Set an alert on heap usage that doesn't return to baseline within N minutes of a full GC. That pattern — heap grows, full GC runs, heap growth resumes from a higher floor — is the signature of a leak. Catching it when the retained heap is 500MB is easier than diagnosing it after the process has been running for three days and is holding 8GB of unreachable objects.
