Take every nth element from a Java 8 stream

One of the prime motivations for the introduction of Java streams was to allow parallel operations. This led to a requirement that operations on Java streams such as map and filter be independent of the position of the item in the stream or the items around it. This has the advantage of making it easy to split streams for parallel processing. It has the disadvantage of making certain operations more complex.

So the simple answer is that there is no easy way to do things such as take every nth item or map each item to the sum of all previous items.

The most straightforward way to implement your requirement is to use the index of the list you are streaming from:

List<String> list = ...;
return IntStream.range(0, list.size())
    .filter(n -> n % 3 == 0)
    .mapToObj(list::get)
    .toList();

A more complicated solution would be to create a custom collector that collects every nth item into a list.

class EveryNth<C> {
    private final int nth;
    private final List<List<C>> lists = new ArrayList<>();
    private int next = 0;

    private EveryNth(int nth) {
        this.nth = nth;
        IntStream.range(0, nth).forEach(i -> lists.add(new ArrayList<>()));
    }

    private void accept(C item) {
        lists.get(next++ % nth).add(item);
    }

    private EveryNth<C> combine(EveryNth<C> other) {
        other.lists.forEach(l -> lists.get(next++ % nth).addAll(l));
        next += other.next;
        return this;
    }

    private List<C> getResult() {
        return lists.get(0);
    }

    public static Collector<Integer, ?, List<Integer>> collector(int nth) {
        return Collector.of(() -> new EveryNth(nth), 
            EveryNth::accept, EveryNth::combine, EveryNth::getResult));
}

This could be used as follows:

Stream.of("Anne", "Bill", "Chris", "Dean", "Eve", "Fred", "George")
    .parallel().collect(EveryNth.collector(3)).toList();

Which returns the result ["Anne", "Dean", "George"] as you would expect.

This is a very inefficient algorithm even with parallel processing. It splits all items it accepts into n lists and then just returns the first. Unfortunately it has to keep all items through the accumulation process because it’s not until they are combined that it knows which list is the nth one.

Given the complexity and inefficiency of the collector solution I would definitely recommend sticking with the indices based solution above in preference to this if you can. If you aren’t using a collection that supports get (e.g. you are passed a Stream rather than a List) then you will either need to collect the stream using Collectors.toList or use the EveryNth solution above.

Leave a Comment