Group by and sum objects like in SQL with Java lambdas?

Using Collectors.groupingBy is the right approach. However, instead of the single-argument version, which creates a list of all items for each group, you should use the two-argument version, which takes another Collector that determines how to aggregate the elements of each group.

This is especially convenient when you want to aggregate a single property of the elements or just count the elements per group:

  • Counting:

    list.stream()
      .collect(Collectors.groupingBy(foo -> foo.id, Collectors.counting()))
      .forEach((id,count)->System.out.println(id+"\t"+count));
    
  • Summing up one property:

    list.stream()
      .collect(Collectors.groupingBy(foo -> foo.id,
                                        Collectors.summingInt(foo->foo.targetCost)))
      .forEach((id,sumTargetCost)->System.out.println(id+"\t"+sumTargetCost));
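
To try the two snippets above in isolation, here is a minimal, self-contained sketch. The shape of Foo (int id, String ref, int targetCost, int actualCost) is my assumption based on the field names used in this answer, and the class name GroupingBasics, the helper methods and the sample data are made up for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingBasics {
    // Assumed shape of Foo: field names follow the snippets above, the types are a guess.
    public static final class Foo {
        public final int id;
        public final String ref;
        public final int targetCost;
        public final int actualCost;

        public Foo(int id, String ref, int targetCost, int actualCost) {
            this.id = id;
            this.ref = ref;
            this.targetCost = targetCost;
            this.actualCost = actualCost;
        }
    }

    // Hypothetical sample data, made up for this sketch.
    public static List<Foo> sampleData() {
        return Arrays.asList(
            new Foo(1, "P1", 300, 400),
            new Foo(2, "P2", 600, 400),
            new Foo(1, "P1", 40, 50));
    }

    // Roughly: SELECT id, COUNT(*) ... GROUP BY id
    public static Map<Integer, Long> countById(List<Foo> list) {
        return list.stream()
            .collect(Collectors.groupingBy(foo -> foo.id, Collectors.counting()));
    }

    // Roughly: SELECT id, SUM(targetCost) ... GROUP BY id
    public static Map<Integer, Integer> sumTargetCostById(List<Foo> list) {
        return list.stream()
            .collect(Collectors.groupingBy(foo -> foo.id,
                                           Collectors.summingInt(foo -> foo.targetCost)));
    }

    public static void main(String[] args) {
        List<Foo> list = sampleData();
        countById(list).forEach((id, count) -> System.out.println(id + "\t" + count));
        sumTargetCostById(list).forEach((id, sum) -> System.out.println(id + "\t" + sum));
    }
}
```

Note that counting() yields a Long per group while summingInt yields an Integer, which is why the two result maps have different value types.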
    

In your case, where you want to aggregate more than one property, specifying a custom reduction operation, as suggested in this answer, is the right approach. However, you can perform the reduction during the grouping operation itself, so there is no need to collect the entire data into a Map<…,List> before performing the reduction:

(I assume from here on that you use import static java.util.stream.Collectors.*;)

list.stream().collect(groupingBy(foo -> foo.id, collectingAndThen(reducing(
  (a,b)-> new Foo(a.id, a.ref, a.targetCost+b.targetCost, a.actualCost+b.actualCost)),
      Optional::get)))
  .forEach((id,foo)->System.out.println(foo));
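
The same reduction as a compilable sketch, in case you want to run it. Again, the Foo class (including its toString), the class name GroupingReduce and the sample data are assumptions of mine, not something taken from your code:

```java
import static java.util.stream.Collectors.collectingAndThen;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.reducing;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class GroupingReduce {
    // Assumed shape of Foo: field names follow the snippet above, the types are a guess.
    public static final class Foo {
        public final int id;
        public final String ref;
        public final int targetCost;
        public final int actualCost;

        public Foo(int id, String ref, int targetCost, int actualCost) {
            this.id = id;
            this.ref = ref;
            this.targetCost = targetCost;
            this.actualCost = actualCost;
        }

        @Override
        public String toString() {
            return id + "\t" + ref + "\t" + targetCost + "\t" + actualCost;
        }
    }

    // Hypothetical sample data, made up for this sketch.
    public static List<Foo> sampleData() {
        return Arrays.asList(
            new Foo(1, "P1", 300, 400),
            new Foo(2, "P2", 600, 400),
            new Foo(1, "P1", 40, 50));
    }

    // Merges all Foos sharing an id into one Foo carrying the summed costs.
    public static Map<Integer, Foo> sumCostsById(List<Foo> list) {
        return list.stream().collect(groupingBy(foo -> foo.id, collectingAndThen(
            // reducing without an identity value produces an Optional<Foo> ...
            reducing((a, b) -> new Foo(a.id, a.ref,
                                       a.targetCost + b.targetCost,
                                       a.actualCost + b.actualCost)),
            // ... which is never empty here, since a group only exists once it has an element
            Optional::get)));
    }

    public static void main(String[] args) {
        sumCostsById(sampleData()).forEach((id, foo) -> System.out.println(foo));
    }
}
```

The merged Foo keeps the ref of the first element of its group, so this only makes sense if ref is the same for all elements sharing an id (or if you do not care about it).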

For completeness, here is a solution to a problem beyond the scope of your question: what if you want to GROUP BY multiple columns/properties?

The first thing that comes to a programmer's mind is to use groupingBy with a function that extracts the relevant properties from the stream's elements and returns a new key object. But this requires an appropriate holder class for the key properties (and Java has no general-purpose Tuple class).
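
To make the cost of that approach concrete, here is a rough sketch of what such a holder class could look like when grouping by [id, actualCost]. The Key class, the class name GroupingByKeyClass, the Foo shape and the sample data are all hypothetical names and values I introduce for illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class GroupingByKeyClass {
    // Assumed shape of Foo: field names follow this answer, the types are a guess.
    public static final class Foo {
        public final int id;
        public final String ref;
        public final int targetCost;
        public final int actualCost;

        public Foo(int id, String ref, int targetCost, int actualCost) {
            this.id = id;
            this.ref = ref;
            this.targetCost = targetCost;
            this.actualCost = actualCost;
        }
    }

    // Hypothetical holder for the grouping key; it needs equals and hashCode
    // so that a HashMap treats two keys with the same values as one group.
    public static final class Key {
        public final int id;
        public final int actualCost;

        public Key(int id, int actualCost) {
            this.id = id;
            this.actualCost = actualCost;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return id == other.id && actualCost == other.actualCost;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id, actualCost);
        }
    }

    // Hypothetical sample data, made up for this sketch.
    public static List<Foo> sampleData() {
        return Arrays.asList(
            new Foo(1, "P1", 300, 400),
            new Foo(2, "P2", 600, 400),
            new Foo(1, "P1", 40, 400),
            new Foo(1, "P1", 10, 50));
    }

    // Groups by [id, actualCost] and sums targetCost per group.
    public static Map<Key, Integer> sumTargetCost(List<Foo> list) {
        return list.stream().collect(Collectors.groupingBy(
            foo -> new Key(foo.id, foo.actualCost),
            Collectors.summingInt(foo -> foo.targetCost)));
    }

    public static void main(String[] args) {
        sumTargetCost(sampleData()).forEach((key, sum) ->
            System.out.println(key.id + "\t" + key.actualCost + "\t" + sum));
    }
}
```

Most of that code is the boilerplate of the key class, not the grouping itself.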

But there is an alternative. By using the three-arg form of groupingBy, we can specify a supplier for the actual Map implementation, which then determines key equality. By using a sorted map with a comparator that compares multiple properties, we get the desired behavior without the need for an additional class. We only have to take care not to read properties of the key instances that our comparator ignores, as they will hold arbitrary values (those of whichever element of the group became the map key):

list.stream().collect(groupingBy(Function.identity(),
  ()->new TreeMap<>(
    // we are effectively grouping by [id, actualCost]
    Comparator.<Foo,Integer>comparing(foo->foo.id).thenComparing(foo->foo.actualCost)
  ), // and aggregating/ summing targetCost
  Collectors.summingInt(foo->foo.targetCost)))
.forEach((group,targetCostSum) ->
    // take the id and actualCost from the group and targetCost from the aggregation
    System.out.println(group.id+"\t"+group.actualCost+"\t"+targetCostSum));
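
Here is the comparator variant as a self-contained sketch you can run. As before, the Foo shape, the class name GroupingByComparator and the sample data are assumptions made up for this example:

```java
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.summingInt;

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

public class GroupingByComparator {
    // Assumed shape of Foo: field names follow the snippet above, the types are a guess.
    public static final class Foo {
        public final int id;
        public final String ref;
        public final int targetCost;
        public final int actualCost;

        public Foo(int id, String ref, int targetCost, int actualCost) {
            this.id = id;
            this.ref = ref;
            this.targetCost = targetCost;
            this.actualCost = actualCost;
        }
    }

    // Hypothetical sample data, made up for this sketch.
    public static List<Foo> sampleData() {
        return Arrays.asList(
            new Foo(1, "P1", 300, 400),
            new Foo(2, "P2", 600, 400),
            new Foo(1, "P1", 40, 400),
            new Foo(1, "P1", 10, 50));
    }

    // Groups by [id, actualCost] and sums targetCost, using the elements themselves as keys.
    public static Map<Foo, Integer> sumTargetCost(List<Foo> list) {
        return list.stream().collect(groupingBy(Function.identity(),
            // the TreeMap considers two Foos equal when the comparator returns 0
            () -> new TreeMap<>(
                Comparator.<Foo, Integer>comparing(foo -> foo.id)
                          .thenComparing(foo -> foo.actualCost)),
            summingInt(foo -> foo.targetCost)));
    }

    public static void main(String[] args) {
        // only id and actualCost of the key are meaningful; ref and targetCost are arbitrary
        sumTargetCost(sampleData()).forEach((group, targetCostSum) ->
            System.out.println(group.id + "\t" + group.actualCost + "\t" + targetCostSum));
    }
}
```

Because the sorted map compares keys with the comparator only, a lookup such as map.get(new Foo(1, "anything", 0, 400)) finds the [1, 400] group regardless of the ref and targetCost values of the probe object.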
