@BatchSize a smart or stupid use?

  1. Yes, @BatchSize is meant to be used with lazy associations.
  2. Hibernate will execute multiple statements in most situations anyway, even if the count of uninitialized proxies/collections is less than the specified batch size. See this answer for more details. Also, several lighter queries, compared with fewer heavier ones, may improve the overall throughput of the system.
  3. @BatchSize at class level means that the specified batch size for the entity will be applied to all lazy @*ToOne associations that point to that entity. See the example with the Person entity in the documentation.

The linked question and answers you provided are more concerned with the need for optimization and lazy loading in general. They apply here as well, of course, but they are not specific to batch loading, which is just one of the possible approaches.

Another important point concerns eager loading. The linked answers suggest that if a property is always used, you may get better performance by loading it eagerly. In general this is not true for collections, and in many situations it is not true for to-one associations either.

For example, suppose you have the following entity for which bs and cs are always used when A is used.

public class A {
  @OneToMany
  private Collection<B> bs;

  @OneToMany
  private Collection<C> cs;
}

Eagerly loading bs and cs obviously suffers from the N+1 selects problem if you don't join them in a single query. But if you do join them in a single query, for example like this:

select a from A
  left join fetch a.bs
  left join fetch a.cs

then you create a full Cartesian product between bs and cs: for each a, the result set contains count(a.bs) x count(a.cs) rows, which are read one by one and assembled into the A entities and their bs and cs collections.
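To make the size of that product concrete, here is a small arithmetic sketch in plain Java. It does not touch Hibernate at all; the class name, method names, and the counts of 50 and 40 are made up for illustration. It only compares how many rows come back for a single A under the two strategies.

```java
public class JoinFetchRowCount {

    // Rows returned for one A when bs and cs are both join-fetched in one query.
    // With left joins, an empty collection still yields one null-padded row,
    // hence the max(..., 1).
    static long joinFetchRows(long bCount, long cCount) {
        return Math.max(bCount, 1) * Math.max(cCount, 1);
    }

    // Rows returned for one A when A, its bs, and its cs are read by
    // three separate queries: one row for A plus one row per B and per C.
    static long separateQueryRows(long bCount, long cCount) {
        return 1 + bCount + cCount;
    }

    public static void main(String[] args) {
        long bCount = 50; // hypothetical number of bs for one A
        long cCount = 40; // hypothetical number of cs for one A
        System.out.println("join fetch rows:      " + joinFetchRows(bCount, cCount));
        System.out.println("separate query rows:  " + separateQueryRows(bCount, cCount));
    }
}
```

With these assumed counts the joined query returns 2000 rows for one A, while the separate queries return 91 in total. Every joined row also repeats the columns of A, so the difference in transferred data is larger than the row counts alone suggest.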

Batch fetching would be a much better fit in this situation, because you would first read the As, then the bs and then the cs. That results in more queries, but far less data is transferred from the database in total. Also, the separate queries are much simpler than one big query with joins, and are easier for the database to execute and optimize.
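The general idea behind batch fetching can be sketched in plain Java: take the ids of the already loaded As, split them into chunks of at most the batch size, and issue one IN query per chunk for each collection. This is only an illustration of the idea, not Hibernate's implementation; the class, the table names b and c, and the column a_id are assumptions, and the SQL Hibernate actually generates (and how it sizes its batches) can differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchFetchSketch {

    // Splits the owner ids into chunks of at most batchSize elements.
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    // Builds one parameterized SELECT per chunk.
    // Table and column names are hypothetical.
    static List<String> batchQueries(String table, String fkColumn,
                                     List<Long> ownerIds, int batchSize) {
        List<String> queries = new ArrayList<>();
        for (List<Long> batch : partition(ownerIds, batchSize)) {
            String placeholders =
                String.join(", ", Collections.nCopies(batch.size(), "?"));
            queries.add("select * from " + table
                + " where " + fkColumn + " in (" + placeholders + ")");
        }
        return queries;
    }

    public static void main(String[] args) {
        // Ids of the As returned by the first query (made-up values).
        List<Long> aIds = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        batchQueries("b", "a_id", aIds, 2).forEach(System.out::println);
        batchQueries("c", "a_id", aIds, 2).forEach(System.out::println);
    }
}
```

In this sketch, five As and a batch size of 2 give three queries for bs and three for cs, each returning only the rows of one collection, instead of one query per A or one joined query carrying the full product.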
