How the JVM Manages Memory — Heap Regions, GC Algorithms, and What to Tune

by Eric Hanson, Backend Developer at Clean Systems Consulting

The heap is not one region

Most developers think of the Java heap as a single pool of memory. It's not. The JVM divides the heap into regions, and different collectors structure these regions differently. Understanding the structure explains why GC behaves the way it does and which knobs actually matter.

The foundational insight behind all generational collectors: most objects die young. A String built to format a log line, a List assembled to pass to a method, a DTO created per request — these objects are unreachable within milliseconds of creation. Allocating and collecting them cheaply is the central problem the generational hypothesis solves.
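
The pattern is easiest to see in code. In a handler like the sketch below (the DTO and method names are illustrative, not from any real codebase), every intermediate object is unreachable the moment the method returns — exactly the allocation profile the generational hypothesis describes:

```java
import java.util.ArrayList;
import java.util.List;

public class RequestHandler {
    record OrderDto(long id, String status) {}   // per-request DTO

    static String handle(long orderId) {
        List<OrderDto> batch = new ArrayList<>();         // dies young
        batch.add(new OrderDto(orderId, "OK"));           // dies young
        String logLine = "order=" + orderId + " n=" + batch.size(); // built, returned
        return logLine;  // only the returned String can outlive the call
    }

    public static void main(String[] args) {
        System.out.println(handle(42L)); // order=42 n=1
    }
}
```

By the time a minor GC runs, the list, the DTO, and the temporary concatenation buffers are all dead, so they cost the collector nothing.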

Generational heap layout

The HotSpot JVM (OpenJDK, Oracle JDK) divides the heap into generations. ParallelGC and SerialGC use this layout directly; G1 implements the same logical generations on top of regions; the legacy CMS collector (removed in Java 14) was also generational. ZGC, by contrast, ran single-generation until Generational ZGC arrived in Java 21. The layout:

Young generation — where new objects are allocated. Divided into Eden (where fresh allocations go) and two Survivor spaces (S0, S1). A minor GC collects only the young generation: live objects are copied from Eden into a Survivor space, bounce between the two Survivor spaces on subsequent collections, and are promoted to the old generation once they have survived enough of them.

Old generation — where long-lived objects reside. Objects are promoted here when they survive enough minor GCs (threshold controlled by -XX:MaxTenuringThreshold, default varies by collector). Old generation collection is a major GC — more expensive, less frequent.

Metaspace — class metadata, method bytecode, JIT-compiled native code. Not part of the heap proper. Controlled by -XX:MaxMetaspaceSize. Before Java 8, this was PermGen (fixed size, common source of OutOfMemoryError: PermGen space). Metaspace grows dynamically until the system limit.

The ratio of young to old generation size matters significantly. Too small a young generation means objects promote too early, filling the old generation with short-lived objects that then trigger expensive major GCs. Too large, and minor GCs take longer. For ParallelGC, the default -XX:NewRatio=2 (young generation ≈ one third of the heap) is a reasonable starting point; G1 sizes the young generation dynamically — by default between 5% and 60% of the heap — to meet its pause target. Workloads with very high allocation rates benefit from a larger young generation.
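
For the collectors with a fixed young/old split, the sizing knobs look like this (values are illustrative, not recommendations):

```
-XX:NewRatio=2       # old:young = 2:1, i.e. young generation = 1/3 of heap
-Xmn1g               # or fix the young generation at an absolute size
-XX:SurvivorRatio=8  # Eden : each Survivor space = 8:1
```

Avoid -Xmn with G1 — pinning the young generation size disables G1's pause-time-driven adaptive sizing.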

The major collectors and what they optimize for

G1GC (Garbage First) — default since Java 9. Divides the heap into equal-sized regions (1–32MB each, depending on heap size) rather than fixed young/old areas. Regions are dynamically assigned as young, old, or humongous (for objects larger than 50% of a region). G1 prioritizes regions with the most garbage first — hence the name. Targets a configurable pause time goal (-XX:MaxGCPauseMillis, default 200ms).

G1 is the right default for most server applications. It balances throughput and latency and handles heap sizes from 4GB to hundreds of gigabytes reasonably well.

ZGC — available since Java 11, production-ready since Java 15. Concurrent collector — almost all GC work happens while the application runs, using load barriers to handle object references during concurrent compaction. Sub-millisecond pause times regardless of heap size. Trades some throughput for very low latency.

Use ZGC when pause time matters more than throughput — latency-sensitive APIs, real-time processing, large heaps where G1's pause times become unpredictable.

Shenandoah — similar goals to ZGC, different implementation. Available in OpenJDK builds. Also concurrent with very low pause times.

ParallelGC — throughput-optimized. Uses multiple threads for both minor and major GC but stops the world completely during collection. Higher throughput than G1 for batch workloads where pauses don't matter. Not suitable for latency-sensitive services.

SerialGC — single-threaded, stop-the-world. For small heaps and single-CPU environments. The JVM's ergonomics select it automatically when they see fewer than two CPUs or very little memory — common in tightly constrained containers, and worth explicitly overriding.

What triggers GC and what it costs

Minor GC triggers when Eden is full. Cost is proportional to the number of live objects in the young generation (surviving objects must be copied to a survivor space or promoted). Dead objects cost nothing — they're simply abandoned when the region is swept. This is why allocation rate matters more than object count for young GC frequency.
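
You can observe allocation rate directly: HotSpot exposes per-thread allocated-byte counters through com.sun.management.ThreadMXBean (a HotSpot-specific extension — the cast below fails on JVMs that don't provide it). A rough sketch:

```java
import java.lang.management.ManagementFactory;

public class AllocMeter {
    // Bytes this thread has allocated since it started (HotSpot extension).
    static long allocatedBytesSoFar() {
        com.sun.management.ThreadMXBean mx =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        return mx.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long before = allocatedBytesSoFar();
        byte[][] garbage = new byte[100][];
        for (int i = 0; i < 100; i++) {
            garbage[i] = new byte[1024];   // ~100 KB of short-lived allocation
        }
        long after = allocatedBytesSoFar();
        System.out.println("allocated ~" + (after - before) + " bytes");
    }
}
```

Sampling this counter across threads over an interval gives an allocation rate in bytes/second — the number that actually predicts minor GC frequency.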

Major GC (or mixed GC in G1) triggers when the old generation fills up, or when G1's concurrent marking cycle completes and identifies enough old-generation garbage to collect. Major GC cost is proportional to live objects in the old generation plus heap fragmentation.

Full GC — the expensive one. Collects the entire heap, including Metaspace. In G1, a full GC is a fallback for promotion failure (old generation can't accept promoted objects) or concurrent cycle failure. A full GC in a G1 application is a symptom of misconfiguration or insufficient heap, not normal operation. If your logs show [Full GC...] regularly, something is wrong.

Reading GC logs

Enable GC logging before you need it:

-Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=10,filesize=20m

This logs all GC events with timestamps to a rotating file. The key events to look for:

[0.532s][info][gc] GC(3) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(2048M) 12.345ms
[4.217s][info][gc] GC(12) Pause Young (Concurrent Start) 1024M->256M(2048M) 18.891ms
[4.218s][info][gc] GC(12) Concurrent Mark Cycle
[5.103s][info][gc] GC(12) Pause Remark 1100M->1100M(2048M) 4.221ms
[5.890s][info][gc] GC(12) Pause Cleanup 1100M->256M(2048M) 1.102ms

The format: before->after(heap size) pause duration. Pause Young events are minor GCs. Concurrent Start begins G1's concurrent marking cycle. Remark and Cleanup are short stop-the-world phases within the concurrent cycle.

GCEasy (web tool) and JVM GC log analyzers parse these logs into throughput percentages, pause time distributions, and allocation rate charts — significantly more useful than reading raw log lines.
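
For quick checks without an external tool, the before->after(total) pause shape parses with a few lines of Java (a sketch — the event record and field names are mine):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLineParser {
    // Matches e.g. "512M->128M(2048M) 12.345ms"
    static final Pattern P =
        Pattern.compile("(\\d+)M->(\\d+)M\\((\\d+)M\\)\\s+([\\d.]+)ms");

    record GcEvent(long beforeMb, long afterMb, long totalMb, double pauseMs) {}

    static GcEvent parse(String line) {
        Matcher m = P.matcher(line);
        if (!m.find()) return null;   // not a pause line with heap deltas
        return new GcEvent(Long.parseLong(m.group(1)),
                           Long.parseLong(m.group(2)),
                           Long.parseLong(m.group(3)),
                           Double.parseDouble(m.group(4)));
    }

    public static void main(String[] args) {
        GcEvent e = parse("[0.532s][info][gc] GC(3) Pause Young (Normal) "
                + "(G1 Evacuation Pause) 512M->128M(2048M) 12.345ms");
        System.out.println(e);
    }
}
```

Summing pauseMs per minute and tracking afterMb over time is often enough to spot a leak or a pause-time regression before reaching for a full analyzer.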

The tuning flags that actually matter

Heap size — set -Xms equal to -Xmx. Different values mean the JVM resizes the heap dynamically, which triggers GC and wastes time. Set them equal to pre-allocate the full heap at startup:

-Xms4g -Xmx4g

Pause time target for G1:

-XX:MaxGCPauseMillis=100

Lower values mean G1 collects smaller regions more frequently. Achievable pause times depend on allocation rate and heap size — setting this to 1ms on a heap with high allocation rate just means G1 misses the target, not that it achieves it.

G1 region size — usually auto-configured, but for heaps with many large objects:

-XX:G1HeapRegionSize=16m

Objects larger than half a region become "humongous" and are allocated in dedicated contiguous humongous regions, treated as part of the old generation and bypassing the young generation entirely. Frequent humongous allocations (large byte arrays, large collections allocated per request) are a significant source of G1 problems. Increase the region size or redesign the allocation.
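
The threshold itself is simple arithmetic — an allocation is humongous when it is at least half the region size (ignoring the few bytes of object header overhead):

```java
public class HumongousCheck {
    // G1 treats an object as humongous when size >= regionSize / 2.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        // With 8 MB regions, a 5 MB byte[] is humongous...
        System.out.println(isHumongous(5 * mb, 8 * mb));   // true
        // ...but with -XX:G1HeapRegionSize=16m it is a normal young allocation.
        System.out.println(isHumongous(5 * mb, 16 * mb));  // false
    }
}
```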

Selecting the collector:

-XX:+UseG1GC          # default since Java 9
-XX:+UseZGC           # low latency, Java 15+
-XX:+UseShenandoahGC  # low latency alternative
-XX:+UseParallelGC    # throughput, batch workloads

GC thread count:

-XX:ParallelGCThreads=8      # stop-the-world GC threads
-XX:ConcGCThreads=4          # concurrent GC threads (G1, ZGC)

Defaults are based on CPU count. In containers with limited CPU, the JVM may see the host CPU count rather than the container limit — fix with -XX:ActiveProcessorCount=N or use Java 11+ which reads container CPU limits from cgroups.
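
A quick way to check what the JVM actually observed — the processor count it sizes GC thread pools from, and the heap ceiling it settled on:

```java
public class RuntimeCheck {
    public static void main(String[] args) {
        // In a container, these should match the container's limits,
        // not the host machine's.
        int cpus = Runtime.getRuntime().availableProcessors();
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("cpus=" + cpus
                + " maxHeapMb=" + (maxHeap / (1024 * 1024)));
    }
}
```

If cpus reports the host's core count inside a CPU-limited container, the GC thread defaults will be oversized for the quota.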

Allocation pressure — what to fix in code before tuning flags

Flags tune the collector. Reducing allocation pressure reduces how often the collector runs. The two allocation patterns that cause the most GC pressure in application code:

Boxing in streams and collections. Stream<Integer> boxes every int. HashMap<String, Integer> boxes every value. In hot paths, use IntStream, primitive arrays, or IntIntHashMap (Eclipse Collections, Koloboke) to avoid the per-element boxing allocation.
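
The difference in a minimal form — both pipelines compute the same sum, but the Stream<Integer> version allocates a boxed Integer per element (outside the small-value cache), while the IntStream version allocates none:

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class BoxingDemo {
    // Boxed pipeline: Stream<Integer> wraps every int in an object.
    static long boxedSum(int n) {
        return Stream.iterate(0, i -> i + 1)   // Stream<Integer>: boxes each value
                     .limit(n)
                     .mapToLong(Integer::longValue)
                     .sum();
    }

    // Primitive pipeline: no per-element allocation.
    static long primitiveSum(int n) {
        return IntStream.range(0, n).asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println(boxedSum(1_000));     // 499500
        System.out.println(primitiveSum(1_000)); // 499500
    }
}
```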

String concatenation in loops. String result = "" followed by result += item in a loop allocates a new String per iteration. Use StringBuilder explicitly or String.join / Collectors.joining in streams.
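
Side by side — both produce the same string, but the naive version allocates and copies a new String on every iteration, while the StringBuilder version grows one buffer:

```java
import java.util.List;

public class JoinDemo {
    // Naive +=: a fresh String (plus a full copy) per iteration.
    static String concatNaive(List<String> items) {
        String result = "";
        for (String item : items) {
            result += item + ",";
        }
        return result;
    }

    // StringBuilder: one growable buffer, one final String allocation.
    static String concatBuilder(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");
        System.out.println(concatBuilder(items)); // a,b,c,
    }
}
```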

Short-lived large objects. A byte[] allocated per request to buffer a response is a humongous object in G1 if it exceeds the region size threshold. Pool or reuse buffers for I/O-heavy services.
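
One common reuse pattern is a per-thread buffer, so each worker thread allocates its large array once instead of once per request (the buffer size here is an illustrative assumption — pick it below half your region size):

```java
public class BufferPool {
    // 1 MB: stays below the humongous threshold for any region size >= 4 MB.
    static final int BUFFER_SIZE = 1 << 20;

    private static final ThreadLocal<byte[]> BUFFER =
        ThreadLocal.withInitial(() -> new byte[BUFFER_SIZE]);

    // Same array instance for every request handled on this thread.
    static byte[] acquire() {
        return BUFFER.get();
    }

    public static void main(String[] args) {
        byte[] a = acquire();
        byte[] b = acquire();
        System.out.println(a == b); // true: reused, not reallocated
    }
}
```

The trade-off: callers must treat the buffer as scratch space valid only for the current request, and the memory stays pinned per thread — a poor fit for large thread pools with rare large requests.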

Profile before tuning. async-profiler with allocation profiling enabled (-e alloc) shows which call sites are responsible for the most byte allocation in production. Fix the top allocators before adjusting GC flags — a 10x reduction in allocation rate is worth more than any GC tuning.

The container trap

The JVM's default heap sizing (-Xmx defaults to 25% of physical RAM) is based on physical memory. In a container with 2GB of memory, the JVM takes 512MB by default — usually too little, leaving most of the container's memory unused while the application GCs constantly.

Explicitly set heap size for containerized applications. A common starting point: 75% of container memory for the heap, leaving room for Metaspace, thread stacks, native memory, and OS buffers:

-Xms1536m -Xmx1536m   # for a 2GB container
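
The same intent can be expressed as a percentage of the container's memory limit (Java 10+), which survives container resizes without editing flags:

```
-XX:InitialRAMPercentage=75.0
-XX:MaxRAMPercentage=75.0
```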

Java 11+ respects container memory limits from cgroups when running under Linux containers. Java 8u191+ added container awareness with -XX:+UseContainerSupport (enabled by default). Verify with -XshowSettings:all that the JVM sees the expected heap and CPU counts.
