String Interning, the String Pool, and Memory in Java — What Actually Happens

by Eric Hanson, Backend Developer at Clean Systems Consulting

The three ways a String ends up in memory

Not all String objects are the same in Java. Where a string lives in memory and whether it shares identity with another string of the same content depends on how it was created.

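String literals (any string written directly in source code) are recorded in the class file's constant pool and resolved to entries in the string pool (also called the string constant pool). HotSpot does this lazily, the first time the code that uses the literal runs, rather than eagerly at class load. The pool is deduplicated: two class files containing the literal "pending" reference the same String object in the pool, not two separate objects.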

new String(...) — explicitly allocating a string — always creates a new object on the heap, separate from the pool, even if an identical string already exists in the pool.

String.intern() — returns the pooled version of a string, adding it to the pool if not already present.

String a = "hello";           // pool
String b = "hello";           // same pool entry as a
String c = new String("hello"); // new heap object, not the pool entry
String d = c.intern();        // returns the pool entry — same as a and b

System.out.println(a == b);   // true  — same pool object
System.out.println(a == c);   // false — c is a separate heap object
System.out.println(a == d);   // true  — d is the pool entry
System.out.println(a.equals(c)); // true — content is the same

This is why == for string comparison is a bug — two strings with identical content may or may not be the same object depending on how they were created. equals() always compares content. == compares identity.

Where the pool lives

Before Java 7, the string pool was in PermGen — a fixed-size memory region separate from the heap. This made aggressive interning dangerous: fill PermGen with interned strings and you get OutOfMemoryError: PermGen space.

Since Java 7, the string pool is on the heap. This means:

  • Pool strings are subject to GC (though in practice, strings interned from literals are reachable through class metadata and rarely collected)
  • The pool can grow as large as the heap allows
  • -XX:StringTableSize controls the number of buckets in the pool's hash table (default 65536 in Java 11+, tunable for large-scale interning)

The pool is implemented as a hash table keyed by string content. intern() performs a lookup — O(1) average — and either returns the existing entry or inserts the new one.
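The lookup-or-insert behaviour can be observed with a minimal sketch. The class and method names here are made up for illustration, and the strings are built at runtime so that javac cannot fold them into pooled literals before intern() is called.

```java
final class InternLookupDemo {

    // Builds a String at runtime so javac cannot turn it into a pooled literal.
    static String build(String left, String right) {
        return new StringBuilder(left).append(right).toString();
    }

    // Two equal-content strings built separately are distinct objects,
    // but intern() maps both to the same canonical instance.
    static boolean internReturnsSameInstance() {
        String first = build("order-", "status");
        String second = build("order-", "status");
        boolean distinctBefore = first != second;
        boolean sameAfter = first.intern() == second.intern();
        return distinctBefore && sameAfter;
    }

    // The canonical instance for "hello" is the one the literal refers to.
    static boolean internMatchesLiteral() {
        return build("hel", "lo").intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(internReturnsSameInstance()); // true
        System.out.println(internMatchesLiteral());      // true
    }
}
```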

How javac handles literals and constant expressions

The Java compiler performs compile-time string concatenation of literals. Constant string expressions are folded at compile time, not runtime:

String s1 = "hello" + " " + "world"; // compile-time: single literal "hello world"
String s2 = "hello world";

System.out.println(s1 == s2); // true — both reference the same pool entry

The compiler folds the concatenation into a single literal. Both s1 and s2 reference the same pool entry. If any operand is a variable (not a compile-time constant), folding doesn't apply:

String prefix = "hello";
String s3 = prefix + " world"; // runtime concatenation — new heap object

System.out.println(s2 == s3); // false — s3 is a separate heap object

final variables that are compile-time constants are treated as literals:

final String PREFIX = "hello";
String s4 = PREFIX + " world"; // compile-time constant — folded to "hello world"

System.out.println(s2 == s4); // true

This is a subtle distinction: final variables that are initialized with non-constant expressions — final String timestamp = LocalDateTime.now().toString() — are not compile-time constants and do not participate in constant folding.
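The difference between the two kinds of final variable can be checked directly. This is a small sketch. runtimePrefix() is a hypothetical helper whose only purpose is to make the initializer a method call, which is never a compile-time constant.

```java
final class ConstantFoldingDemo {

    // Hypothetical helper: a method call is never a compile-time constant,
    // even when it always returns the same value.
    static String runtimePrefix() {
        return "hello";
    }

    static boolean constantFinalIsFolded() {
        final String prefix = "hello";      // constant variable
        String joined = prefix + " world";  // folded by javac into one literal
        return joined == "hello world";
    }

    static boolean nonConstantFinalIsNotFolded() {
        final String prefix = runtimePrefix(); // final, but not a constant
        String joined = prefix + " world";     // concatenated at runtime
        // Same content, different object.
        return joined != "hello world" && joined.equals("hello world");
    }

    public static void main(String[] args) {
        System.out.println(constantFinalIsFolded());       // true
        System.out.println(nonConstantFinalIsNotFolded()); // true
    }
}
```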

String concatenation and allocation

How the + operator on strings compiles depends on the Java version. Up to Java 8, javac emits a StringBuilder chain. Since Java 9 (JEP 280), it emits an invokedynamic call site bootstrapped by StringConcatFactory, which usually builds the result without an intermediate StringBuilder. Either way, each non-constant concatenation expression creates a new String object on the heap, not in the pool:

String result = "Order " + orderId + " status: " + status;
// On Java 8 and earlier, roughly equivalent to:
// new StringBuilder().append("Order ").append(orderId)
//     .append(" status: ").append(status).toString()

In a hot path called millions of times, this allocates at least one new String (plus its backing array) per call, and on Java 8 and earlier a StringBuilder as well. For logging, where the string may never be used because the log level is off, the allocation happens before the level check:

// Allocates the string even if DEBUG is disabled
logger.debug("Processing order " + orderId + " for user " + userId);

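// No message string is built if DEBUG is disabled; formatting happens after the level check
logger.debug("Processing order {} for user {}", orderId, userId);
// Or with a supplier, evaluated only if DEBUG is enabled (SLF4J 2.x fluent API;
// Log4j 2 accepts logger.debug(() -> ...) directly):
logger.atDebug().log(() -> "Processing order " + orderId + " for user " + userId);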

SLF4J's parameterized logging ({} placeholders) defers string construction until after the level check. This is not a minor optimization in high-throughput services: a DEBUG statement in a method called 100,000 times per second builds at least 100,000 throwaway strings per second, each with its own backing array, if the message is always constructed.
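The effect of deferring construction can be measured by counting how often the message is actually built. The sketch below uses java.util.logging from the JDK instead of SLF4J so that it runs without dependencies. Its Supplier overload plays the same role as the deferred forms above. The class name, logger name, and counter are illustrative.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class DeferredLoggingDemo {

    // Counts how often the message string is actually built.
    static final AtomicInteger BUILDS = new AtomicInteger();

    static String buildMessage(long orderId, String userId) {
        BUILDS.incrementAndGet();
        return "Processing order " + orderId + " for user " + userId;
    }

    // Returns how many messages were built across `calls` log statements.
    static int buildsWithLevel(Level level, boolean eager, int calls) {
        Logger logger = Logger.getLogger("demo.deferred");
        logger.setUseParentHandlers(false); // keep the demo quiet
        logger.setLevel(level);
        BUILDS.set(0);
        for (int i = 0; i < calls; i++) {
            final long orderId = i;
            if (eager) {
                // Argument is evaluated before fine() can check the level.
                logger.fine(buildMessage(orderId, "u1"));
            } else {
                // Supplier is invoked only if FINE is enabled.
                logger.fine(() -> buildMessage(orderId, "u1"));
            }
        }
        return BUILDS.get();
    }

    public static void main(String[] args) {
        System.out.println(buildsWithLevel(Level.INFO, true, 1000));  // 1000
        System.out.println(buildsWithLevel(Level.INFO, false, 1000)); // 0
        System.out.println(buildsWithLevel(Level.FINE, false, 1000)); // 1000
    }
}
```

With the level at INFO, the eager form still builds every message while the deferred form builds none.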

intern() — when it helps and when it backfires

intern() is appropriate when you have a large number of objects holding the same string values, and equality checks are frequent and performance-sensitive. The canonical case: a field that holds one of a small set of known values — status codes, category names, currency codes.

// Without interning — each deserialized record creates a new String
record.setStatus(jsonNode.get("status").asText()); // "pending", "shipped", etc.

// With interning — all records with status "pending" share one object
record.setStatus(jsonNode.get("status").asText().intern());

// Equality check becomes identity check
if (record.getStatus() == "pending") { ... } // valid after interning

The memory saving: 10 million records each holding a separate "pending" string consume 10 million String objects. On a 64-bit JVM with compressed oops that is roughly 240 MB for the String objects alone, and about as much again for their backing arrays. With interning, they all reference one object.
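The arithmetic behind that estimate can be sketched as follows. The layout numbers are assumptions about a typical 64-bit HotSpot JVM with compressed oops and compact strings (Java 9+): a 12-byte object header, a 16-byte array header, and 8-byte alignment. Actual sizes vary by JVM and flags, and a tool such as JOL can measure them.

```java
final class StringFootprintEstimate {

    // Objects are padded to a multiple of 8 bytes (assumed default alignment).
    static long align8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Assumed String layout: 12-byte header + 4-byte array reference
    // + 4-byte cached hash + 1-byte coder + 1-byte hash flag.
    static long stringObjectBytes() {
        return align8(12 + 4 + 4 + 1 + 1);
    }

    // Assumed byte[] layout: 16-byte header + one byte per Latin-1 character.
    static long latin1ArrayBytes(int length) {
        return align8(16 + length);
    }

    static long totalBytes(long count, int length) {
        return count * (stringObjectBytes() + latin1ArrayBytes(length));
    }

    public static void main(String[] args) {
        long count = 10_000_000L;
        int length = "pending".length(); // 7 characters
        System.out.println(count * stringObjectBytes()); // 240000000
        System.out.println(totalBytes(count, length));   // 480000000
    }
}
```

Under these assumptions the String objects alone account for about 240 MB, and the backing arrays bring the total to about 480 MB.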

The identity-check optimization is real but dangerous as a practice — it works only if you can guarantee all strings in the comparison have been interned, which requires discipline across the entire codebase. Miss one new String(...) and == silently returns false. equals() is always safer.

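The risk: interning high-cardinality strings (user IDs, session tokens, request IDs) fills the pool with unique values. Since Java 7 an interned string can be collected once nothing else references it, but every entry that is still referenced stays in the table, lookups can slow down as the table grows, and the collector has extra work clearing out dead entries. On Java 6 and earlier the entries lived in PermGen and could exhaust it outright. This is the String.intern() memory leak described in the memory leaks article.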

The rule: intern strings only if the cardinality is bounded and small. Status codes, ISO currency codes, HTTP method names — these are safe to intern. Arbitrary user input, request identifiers, URLs — these are not.
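One way to enforce the bounded-cardinality rule is to canonicalize against a fixed set of known values instead of calling intern() on whatever arrives. This is a different technique from the JVM pool: an application-level map, sketched here under the assumption that the set of valid values is known up front. StatusCanonicalizer and the status values are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class StatusCanonicalizer {

    // Maps each known value to its single shared instance.
    private final Map<String, String> canonical;

    StatusCanonicalizer(Set<String> knownValues) {
        Map<String, String> map = new HashMap<>();
        for (String value : knownValues) {
            map.put(value, value);
        }
        this.canonical = Collections.unmodifiableMap(map);
    }

    // Known values come back as the shared instance. Unknown values pass
    // through unchanged, so unbounded input never grows the table.
    String canonicalize(String value) {
        if (value == null) {
            return null;
        }
        return canonical.getOrDefault(value, value);
    }

    public static void main(String[] args) {
        StatusCanonicalizer statuses = new StatusCanonicalizer(
                new HashSet<>(Arrays.asList("pending", "shipped", "delivered")));
        String a = statuses.canonicalize(new String("pending"));
        String b = statuses.canonicalize(new String("pending"));
        System.out.println(a == b); // true
    }
}
```

Memory is shared for the expected values, and an unexpected value costs nothing beyond the string itself.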

G1 string deduplication — automatic without interning

G1GC has a background string deduplication feature that identifies String objects with identical content and replaces their backing char[] (or byte[] since Java 9's compact strings) with a shared reference — without changing the String object's identity or moving it to the pool:

-XX:+UseStringDeduplication  # off by default; needs a collector that supports it, such as G1 (the default collector since Java 9)

Deduplication works in the background. The collector queues candidate String objects as it encounters them during collection, and a dedicated thread then finds strings with identical contents and points them at a single backing array. The String objects remain separate heap objects, so == is still false, but they share backing storage.

This reduces heap usage for applications with many duplicate strings without the risks of intern(). The tradeoff: the extra bookkeeping during collection and the background thread have a small throughput cost. For applications with high string duplication (log processing, data pipelines, applications that parse the same field values repeatedly), the memory savings typically outweigh the cost.

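Monitor with:

-XX:+PrintStringDeduplicationStatistics   # Java 8
-Xlog:stringdedup*=debug                  # Java 9+, where unified logging replaced the flag above

This logs how many strings were deduplicated and how much space was reclaimed. The exact output format varies by JDK version.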

The equals() contract and pool assumptions

One final trap: code that assumes pool membership and uses == breaks when strings arrive from outside the pool:

// Brittle — works only if status was interned or is a literal
if (order.getStatus() == "PENDING") { ... }

// This works regardless of how status was created
if ("PENDING".equals(order.getStatus())) { ... }
// Putting the literal first also handles null safely — no NullPointerException

The equals() method is defined on content, not identity. It works correctly regardless of whether either string was interned, created with new, deserialized from JSON, read from a database, or produced by concatenation. == works correctly only for strings you can guarantee are pool entries, which in practice means comparing literals and other compile-time constants with each other. The pool is shared across the whole JVM, so class loaders are not the problem. The problem is that most strings in a running application were never literals in the first place.
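A short sketch of content-based comparison with strings from different origins. The names are illustrative. Decoding from bytes stands in for a value read from the network or a database, and Objects.equals covers the case where both sides may be null.

```java
import java.nio.charset.StandardCharsets;
import java.util.Objects;

final class StatusComparison {

    // Literal first: returns false for null instead of throwing.
    static boolean isPending(String status) {
        return "PENDING".equals(status);
    }

    // Null-safe comparison when neither side is a literal.
    static boolean sameStatus(String left, String right) {
        return Objects.equals(left, right);
    }

    // Stands in for a value arriving from outside the pool.
    static String decoded(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String fromWire = decoded("PENDING");
        System.out.println(fromWire == "PENDING");  // false
        System.out.println(isPending(fromWire));    // true
        System.out.println(isPending(null));        // false
        System.out.println(sameStatus(null, null)); // true
    }
}
```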

The practical takeaway: use equals() for string comparison in all application code. Use intern() only for deliberate memory optimization on bounded-cardinality strings, with awareness of the pool growth risk. Let G1's deduplication handle the rest if memory pressure from duplicate strings is a measured problem.
