Collectors, flatMap, and Reduce in Java Streams — The Operations That Take More Than a Minute to Learn
by Eric Hanson, Backend Developer at Clean Systems Consulting
flatMap — the operation most developers reach for too late
map transforms each element into one element. flatMap transforms each element into zero or more elements — a stream — and flattens all those streams into one:
// map produces Stream<List<LineItem>> — a stream of lists
orders.stream()
.map(Order::getLineItems) // each order becomes a List<LineItem>
// flatMap produces Stream<LineItem> — a flat stream of all line items
orders.stream()
.flatMap(order -> order.getLineItems().stream()) // each order contributes multiple elements
.filter(item -> item.getQuantity() > 1)
.collect(Collectors.toList());
The distinction: map is one-to-one. flatMap is one-to-many, then flattened. When the transformation produces a collection and you want to work with the elements of that collection, flatMap is correct. Using map produces nested streams that require unwrapping.
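To make the nesting concrete, here is a minimal, runnable sketch using plain strings instead of the Order domain (the names are illustrative, not from the article):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatMapVsMap {
    // map: one-to-one — each word becomes one List<String>, so the
    // result is a nested List<List<String>>
    static List<List<String>> mapped(List<String> words) {
        return words.stream()
                .map(w -> List.of(w.split("")))
                .collect(Collectors.toList());
    }

    // flatMap: one-to-many, flattened — each word contributes its
    // letters directly to a single flat stream
    static List<String> flattened(List<String> words) {
        return words.stream()
                .flatMap(w -> Stream.of(w.split("")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = List.of("ab", "cd");
        System.out.println(mapped(words));    // [[a, b], [c, d]]
        System.out.println(flattened(words)); // [a, b, c, d]
    }
}
```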
flatMap also handles optional flattening — a common pattern when filtering and transforming in one step:
// Map each ID to an Optional<Order>, then flatten to only present values
List<Order> found = orderIds.stream()
.map(id -> orderRepository.findById(id)) // Stream<Optional<Order>>
.flatMap(Optional::stream) // Stream<Order> — only present values
.collect(Collectors.toList());
Optional::stream (Java 9+) returns a stream of one element if present, empty stream if absent. flatMap with Optional::stream is the idiomatic way to filter and unwrap optionals in a single operation.
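On Java 8, where Optional::stream does not exist, the same filter-and-unwrap is conventionally written as a filter/map pair. A runnable sketch, with a map lookup standing in for the repository (the lookup is an assumption for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class OptionalUnwrap {
    // Stand-in for orderRepository.findById — a plain map lookup
    static final Map<Long, String> ORDERS = Map.of(1L, "ord-1", 3L, "ord-3");

    static Optional<String> findById(Long id) {
        return Optional.ofNullable(ORDERS.get(id));
    }

    // Java 8 idiom: filter out empties, then unwrap.
    // get() is safe here only because isPresent was checked first.
    static List<String> findAll(List<Long> ids) {
        return ids.stream()
                .map(OptionalUnwrap::findById)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(findAll(List.of(1L, 2L, 3L))); // [ord-1, ord-3]
    }
}
```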
reduce — building a result from a stream
reduce is the general accumulation operation. It takes an identity value and an associative combining function, and folds all elements into a single result:
// Sum of order totals
long total = orders.stream()
.mapToLong(Order::getTotal)
.reduce(0L, Long::sum);
// String concatenation — illustrative only, use Collectors.joining in practice
String joined = Stream.of("a", "b", "c")
.reduce("", (acc, s) -> acc + s); // "abc"
The identity value must be an identity for the combining function — 0 for addition, 1 for multiplication, "" for concatenation. If no elements are in the stream, the identity is returned.
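The identity requirement is easy to break. A non-identity seed gives a wrong answer sequentially — and a differently wrong one in parallel, where the seed is applied once per chunk of the split (illustrative values, not from the article):

```java
import java.util.stream.Stream;

public class ReduceIdentity {
    static int sumWithSeed(int seed) {
        return Stream.of(1, 2, 3).reduce(seed, Integer::sum);
    }

    public static void main(String[] args) {
        // Correct: 0 is the identity for addition
        System.out.println(sumWithSeed(0));  // 6

        // Wrong: 10 is not an identity, so it leaks into the result
        System.out.println(sumWithSeed(10)); // 16

        // Worse in parallel: the non-identity seed is applied to each
        // chunk of the split, so the error compounds unpredictably
        int parallel = Stream.of(1, 2, 3).parallel().reduce(10, Integer::sum);
        System.out.println(parallel); // >= 16, depends on how the stream splits
    }
}
```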
The two-argument reduce returns T. The one-argument version (no identity) returns Optional<T> — empty if the stream is empty:
Optional<Order> mostExpensive = orders.stream()
.reduce((a, b) -> a.getTotal() > b.getTotal() ? a : b);
When to use reduce vs specialized operations. For numeric reductions, use mapToInt/mapToLong/mapToDouble followed by sum(), average(), min(), max() — they're more readable and avoid boxing:
// Prefer this
long total = orders.stream().mapToLong(Order::getTotal).sum();
// Over this
long total = orders.stream().map(Order::getTotal).reduce(0L, Long::sum);
reduce earns its place for non-numeric accumulations where no specialized method exists — totaling BigDecimal values, combining immutable result objects, or finding an element by custom comparison. Accumulation into a mutable container is collect's job, not reduce's: reduce assumes the combining function returns new values rather than mutating its arguments.
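BigDecimal is the canonical case: there is no mapToBigDecimal specialization, so reduce with BigDecimal.ZERO as the identity is the standard way to total monetary values. A minimal sketch:

```java
import java.math.BigDecimal;
import java.util.List;

public class BigDecimalSum {
    // No primitive specialization exists for BigDecimal, so reduce is
    // idiomatic here: ZERO is the additive identity, add is associative
    static BigDecimal total(List<BigDecimal> amounts) {
        return amounts.stream()
                .reduce(BigDecimal.ZERO, BigDecimal::add);
    }

    public static void main(String[] args) {
        List<BigDecimal> amounts = List.of(
                new BigDecimal("19.99"),
                new BigDecimal("5.01"),
                new BigDecimal("100.00"));
        System.out.println(total(amounts)); // 125.00
    }
}
```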
The Collectors toolkit
Collectors is where streams become genuinely powerful for data processing. The operations beyond toList() and toSet():
groupingBy — grouping into a Map
groupingBy groups elements by a classifier function, producing Map<K, List<V>>:
Map<String, List<Order>> byStatus = orders.stream()
.collect(Collectors.groupingBy(Order::getStatus));
// { "pending" -> [...], "shipped" -> [...], "cancelled" -> [...] }
The downstream collector argument transforms the grouped lists:
// Count per status instead of list
Map<String, Long> countByStatus = orders.stream()
.collect(Collectors.groupingBy(
Order::getStatus,
Collectors.counting()
));
// Sum of totals per status
Map<String, Long> totalByStatus = orders.stream()
.collect(Collectors.groupingBy(
Order::getStatus,
Collectors.summingLong(Order::getTotal)
));
// Map to a different value type
Map<String, List<Long>> idsByStatus = orders.stream()
.collect(Collectors.groupingBy(
Order::getStatus,
Collectors.mapping(Order::getId, Collectors.toList())
));
Multi-level grouping — group by status, then by customer:
Map<String, Map<Long, List<Order>>> byStatusThenCustomer = orders.stream()
.collect(Collectors.groupingBy(
Order::getStatus,
Collectors.groupingBy(Order::getCustomerId)
));
partitioningBy — binary grouping
partitioningBy is groupingBy specialized to a Predicate classifier — the result map always has exactly two keys, true and false:
Map<Boolean, List<Order>> partition = orders.stream()
.collect(Collectors.partitioningBy(order -> order.getTotal() > 10_000));
List<Order> largeOrders = partition.get(true);
List<Order> smallOrders = partition.get(false);
Slightly more efficient than groupingBy with a boolean classifier because the result map is always exactly two entries — and both keys are guaranteed present, so partition.get(true) and partition.get(false) never return null even when one side is empty. groupingBy would simply omit a key with no elements.
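Like groupingBy, partitioningBy accepts a downstream collector. A runnable sketch with a simplified Order record (the record and its fields are assumptions for illustration):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {
    record Order(long id, long total) {}

    // Count each side of the partition instead of collecting lists
    static Map<Boolean, Long> countBySize(List<Order> orders) {
        return orders.stream()
                .collect(Collectors.partitioningBy(
                        o -> o.total() > 10_000,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(1, 25_000),
                new Order(2, 3_000),
                new Order(3, 500));
        Map<Boolean, Long> counts = countBySize(orders);
        System.out.println(counts.get(true));  // 1
        System.out.println(counts.get(false)); // 2
    }
}
```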
joining — string construction
Collectors.joining replaces StringBuilder loops for building strings from stream elements. It accepts only CharSequence elements, so non-String values such as a numeric id must be converted first:
String csv = orders.stream()
.map(order -> String.valueOf(order.getId()))
.collect(Collectors.joining(", "));
// "1, 2, 3"
String wrapped = orders.stream()
.map(order -> String.valueOf(order.getId()))
.collect(Collectors.joining(", ", "[", "]"));
// "[1, 2, 3]"
toMap — explicit key and value extraction
toMap builds a Map with explicit key and value functions:
Map<Long, Order> ordersById = orders.stream()
.collect(Collectors.toMap(Order::getId, Function.identity()));
The third argument handles duplicate keys — required if keys might collide:
// Keep the higher-value order on collision
Map<Long, Order> highValueByCustomer = orders.stream()
.collect(Collectors.toMap(
Order::getCustomerId,
Function.identity(),
(existing, replacement) ->
existing.getTotal() > replacement.getTotal() ? existing : replacement
));
Without the merge function, duplicate keys throw IllegalStateException. This is intentional — toMap is strict about key uniqueness by default. The exception tells you that you have duplicate keys that require a decision; it's better than silently overwriting.
The fourth argument specifies the map implementation — useful when insertion order matters:
Map<Long, Order> ordered = orders.stream()
.collect(Collectors.toMap(
Order::getId,
Function.identity(),
(a, b) -> a,
LinkedHashMap::new // maintains insertion order
));
Custom collectors — when the built-ins don't fit
Collector.of() builds a custom collector when Collectors doesn't have what you need:
// Collector that builds an ImmutableList (Guava)
Collector<Order, ImmutableList.Builder<Order>, ImmutableList<Order>> toImmutableList =
Collector.of(
ImmutableList::builder, // supplier — creates the mutable accumulator
ImmutableList.Builder::add, // accumulator — adds an element
(b1, b2) -> b1.addAll(b2.build()), // combiner — merges two accumulators (parallel)
ImmutableList.Builder::build // finisher — converts accumulator to result
);
ImmutableList<Order> immutable = orders.stream().collect(toImmutableList);
The four functions: supplier creates a new mutable container, accumulator adds one element to the container, combiner merges two containers (used in parallel streams — must be associative), finisher converts the container to the final result type.
A more practical custom collector — collecting into a statistics object:
record OrderStats(long count, long totalValue, long maxValue) {}
Collector<Order, long[], OrderStats> statsCollector = Collector.of(
() -> new long[3], // [count, sum, max]
(arr, order) -> {
arr[0]++;
arr[1] += order.getTotal();
arr[2] = Math.max(arr[2], order.getTotal());
},
(a, b) -> new long[]{a[0]+b[0], a[1]+b[1], Math.max(a[2], b[2])},
arr -> new OrderStats(arr[0], arr[1], arr[2])
);
OrderStats stats = orders.stream().collect(statsCollector);
This computes count, sum, and max in a single pass with a primitive array accumulator — no boxing, no multiple stream passes.
teeing — splitting a stream into two collectors
Collectors.teeing (Java 12+) processes a stream through two collectors simultaneously and merges the results:
record Summary(List<Order> large, long smallTotal) {}
Summary summary = orders.stream().collect(
Collectors.teeing(
Collectors.filtering(o -> o.getTotal() > 10_000, Collectors.toList()),
Collectors.filtering(o -> o.getTotal() <= 10_000,
Collectors.summingLong(Order::getTotal)),
Summary::new
)
);
teeing replaces two separate stream passes or a partitioningBy with downstream collection when the two halves produce different result types. It processes each element exactly once, feeding it to both downstream collectors.
The single-pass principle
The practical value of complex collectors — custom collectors, teeing, nested groupingBy — is computing multiple results in a single iteration over the data. Each additional stream pass over a large collection adds cost. When you find yourself writing stream().filter(x).count() followed by stream().filter(x).collect(toList()), that's two passes where one suffices:
// Two passes
long count = orders.stream().filter(Order::isPending).count();
List<Order> pending = orders.stream().filter(Order::isPending).collect(Collectors.toList());
// One pass with teeing
record PendingResult(long count, List<Order> orders) {}
PendingResult result = orders.stream()
.filter(Order::isPending)
.collect(Collectors.teeing(
Collectors.counting(),
Collectors.toList(),
PendingResult::new
));
For in-memory collections this is a minor optimization. For streams backed by database queries, file reads, or network data, avoiding multiple passes is a correctness concern as much as a performance one.