Sorting in Java — Comparators, Natural Ordering, and Where Performance Actually Comes From
by Eric Hanson, Backend Developer at Clean Systems Consulting
TimSort — what Java's sort actually does
Arrays.sort and Collections.sort use TimSort for object arrays and lists since Java 7. TimSort is a hybrid of merge sort and insertion sort that exploits natural runs — sequences of already-sorted elements — in the input. On partially sorted data (common in practice: appended records, nearly-sorted lists, recently modified datasets), TimSort approaches O(n). Worst case is O(n log n).
For primitive arrays (int[], long[], double[]), Arrays.sort uses a dual-pivot quicksort — faster in practice for primitives because there are no object references, no comparator dispatch, and the data is cache-friendly. If you have a collection of boxed integers that you're sorting, converting to int[] first and sorting the primitive array is measurably faster for large arrays.
The practical implication: Java's sort is already optimized. The bottleneck in most slow sorts is the comparator, not the algorithm.
Comparable — natural ordering
Comparable<T> defines a class's natural ordering — the default sort order the class itself declares. Implementing Comparable means the class has an inherent concept of "less than" and "greater than":
public class Order implements Comparable<Order> {
private final long id;
private final Instant createdAt;
private final long total;
@Override
public int compareTo(Order other) {
return Long.compare(this.id, other.id); // natural order: ascending by ID
}
}
Long.compare(a, b) — not a - b. The subtraction idiom is a classic bug: when a is a large positive value and b is a large negative value (or vice versa), the subtraction overflows and returns the wrong sign. Always use Type.compare() for numeric comparisons in comparators.
The Comparable contract requires three properties:
- Antisymmetry:
sgn(x.compareTo(y)) == -sgn(y.compareTo(x)) - Transitivity: if
x.compareTo(y) > 0andy.compareTo(z) > 0, thenx.compareTo(z) > 0 - Consistency with equals (strongly recommended):
x.compareTo(y) == 0should implyx.equals(y)
Violating consistency with equals is legal but confusing — objects that compareTo returns 0 for will behave as equal in TreeMap and TreeSet (which use compareTo for equality) but as unequal in HashMap and HashSet (which use equals). The same object in both a TreeSet and HashSet may exhibit contradictory membership behavior.
BigDecimal is the standard library example of this inconsistency: new BigDecimal("1.0").compareTo(new BigDecimal("1.00")) returns 0, but new BigDecimal("1.0").equals(new BigDecimal("1.00")) returns false. TreeSet<BigDecimal> treats them as duplicates; HashSet<BigDecimal> does not.
Comparator — external ordering
Comparator<T> defines ordering externally — without modifying the class being ordered. Use it when:
- The class doesn't implement
Comparable - You need a different ordering than the natural one
- You need multiple orderings for the same class
The modern Comparator API (Java 8+) composes cleanly:
// Single-field ascending
Comparator<Order> byTotal = Comparator.comparingLong(Order::getTotal);
// Single-field descending
Comparator<Order> byTotalDesc = Comparator.comparingLong(Order::getTotal).reversed();
// Multi-field: first by status, then by total descending, then by ID
Comparator<Order> complex = Comparator
.comparing(Order::getStatus)
.thenComparingLong(Order::getTotal).reversed()
.thenComparingLong(Order::getId);
Wait — the .reversed() call above applies to the entire chain up to that point, not just thenComparingLong. This is a common mistake. reversed() inverts the entire comparator it's called on. To reverse only one level of a multi-field sort, build that comparator separately:
// Correct: status ascending, total descending, ID ascending
Comparator<Order> byStatus = Comparator.comparing(Order::getStatus);
Comparator<Order> byTotalDesc = Comparator.comparingLong(Order::getTotal).reversed();
Comparator<Order> byId = Comparator.comparingLong(Order::getId);
Comparator<Order> combined = byStatus.thenComparing(byTotalDesc).thenComparing(byId);
Or use Comparator.comparingLong(..., Comparator.reverseOrder()) for descending within a chain:
// All in one chain, correctly reversed only for total
Comparator<Order> combined = Comparator
.comparing(Order::getStatus)
.thenComparing(Order::getTotal, Comparator.reverseOrder())
.thenComparingLong(Order::getId);
Null handling
Standard comparators throw NullPointerException when encountering null values. Comparator.nullsFirst() and Comparator.nullsLast() wrap a comparator to handle nulls explicitly:
// Sort by optional field — nulls last
Comparator<Order> byShippedAt = Comparator.comparing(
Order::getShippedAt,
Comparator.nullsLast(Comparator.naturalOrder())
);
// Sort by optional field — nulls first, then descending
Comparator<Order> byShippedAtDesc = Comparator.comparing(
Order::getShippedAt,
Comparator.nullsFirst(Comparator.reverseOrder())
);
nullsFirst and nullsLast wrap the inner comparator — null values sort before or after all non-null values, and non-null values are compared by the wrapped comparator.
For sorting nullable objects themselves (not fields of objects):
List<String> withNulls = Arrays.asList("b", null, "a", null, "c");
withNulls.sort(Comparator.nullsLast(Comparator.naturalOrder()));
// ["a", "b", "c", null, null]
Comparator performance — where the cost hides
For a sort of n elements, the comparator is called O(n log n) times. A comparator that allocates objects or does non-trivial work is called that many times.
Method reference vs lambda. Comparator.comparing(Order::getTotal) and Comparator.comparing(o -> o.getTotal()) produce comparable bytecode. The method reference is preferred for clarity, not performance.
String comparison is expensive at scale. String comparisons scan characters until a difference is found. Sorting 100,000 orders by a long string field (URL, description, JSON payload) can be slow because each comparison scans many characters. For sorting on strings that share long common prefixes, consider a key extraction that captures only the differentiating portion.
Schwartzian transform for expensive key extraction. When the key extraction is expensive (parsing, database lookup, regex match) and the same key is extracted multiple times during sorting, precompute the keys:
// Expensive key extracted O(n log n) times
orders.sort(Comparator.comparing(order -> parseComplexKey(order.getMetadata())));
// Extract once, sort on extracted values — O(n) extractions + O(n log n) int comparisons
List<Order> sorted = orders.stream()
.map(order -> Map.entry(parseComplexKey(order.getMetadata()), order))
.sorted(Map.Entry.comparingByKey())
.map(Map.Entry::getValue)
.collect(Collectors.toList());
Map.Entry.comparingByKey() sorts by the precomputed key. The extraction happens once per element, not once per comparison. For large collections with expensive key extraction, this can be the difference between seconds and milliseconds.
The comparator contract violation that silently corrupts sorted collections
A comparator that violates transitivity can produce unpredictable behavior in sorted structures — TreeMap, TreeSet, sort methods. The violation isn't always caught immediately; it can corrupt the structure silently:
// Violates transitivity — not a valid total ordering
Comparator<Order> broken = (a, b) -> {
if (a.getTotal() > 1000 && b.getTotal() < 500) return -1;
if (a.getTotal() < 500 && b.getTotal() > 1000) return 1;
return 0; // treats mid-range orders as equal to everything
};
This comparator says orders over 1000 come before orders under 500, but treats mid-range orders as equal to both. That's not transitive — if A (1200) < B (300) and B (300) = C (700), transitivity requires A < C, but the comparator says A > C (1200 > 700 is treated as 0 in the second arm, then the first arm doesn't fire because 700 is not < 500).
Java 8's List.sort and Arrays.sort detect some transitivity violations and throw IllegalArgumentException: Comparison method violates its general contract! — the error that appears with no obvious cause in production. The fix is always the comparator. Simplify to a total ordering.
Sorting stability
Java's Collections.sort and Arrays.sort for objects are stable — equal elements maintain their original relative order. This matters when sorting by one field and then by another: sort by date, then sort by status, and elements with the same status retain their date order. Stable sorts make multi-field sorting by repeated single-field sorts predictable.
The dual-pivot quicksort for primitive arrays is not guaranteed stable. If stability matters for primitive data, convert to boxed objects, sort with Arrays.sort(Object[]), and convert back — or accept the instability if order among equal elements doesn't matter.
The practical summary
Use Comparable when the class has a single obvious natural ordering and owns its definition. Use Comparator for everything else: external ordering, multiple orderings, nullable fields, composition from simpler comparators.
Build comparators with Comparator.comparing and thenComparing — not by hand. The composition API is readable, correct by construction (it handles the reversed() semantics you'll get wrong manually), and eliminates the subtraction overflow bug class entirely.
For large datasets with expensive key extraction, precompute keys. For the sort itself, trust TimSort — it's not the bottleneck.