Hibernate batch size confusion

When you call session.save(), Hibernate generates an INSERT statement. This INSERT is not sent to the database immediately; it is queued and issued during flushing (i.e. session.flush()).

During flushing, if hibernate.jdbc.batch_size is set to a non-zero value, Hibernate uses the batching feature introduced in the JDBC 2 API to issue the queued INSERT statements to the DB in batches.
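
As a minimal sketch, batching is typically enabled through configuration properties like the following (the value 50 is just an example; the two ordering properties are optional but help Hibernate group statements for the same table into one batch):

hibernate.jdbc.batch_size=50
hibernate.order_inserts=true
hibernate.order_updates=true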

For example, suppose you save() 100 records and hibernate.jdbc.batch_size is set to 50. During flushing, instead of issuing the following SQL 100 times:

insert into TableA (id, fields) values (1, 'val1');
insert into TableA (id, fields) values (2, 'val2');
insert into TableA (id, fields) values (3, 'val3');
.........................
insert into TableA (id, fields) values (100, 'val100');

Hibernate will group them into batches of 50 and issue only 2 round trips to the DB, conceptually equivalent to:

insert into TableA (id, fields) values (1, 'val1'), (2, 'val2'), (3, 'val3'), (4, 'val4'), ......, (50, 'val50')
insert into TableA (id, fields) values (51, 'val51'), (52, 'val52'), (53, 'val53'), (54, 'val54'), ......, (100, 'val100')
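
Note that whether the batch is literally rewritten into multi-row INSERTs like the above depends on the JDBC driver. With MySQL Connector/J, for instance, you must opt in on the JDBC URL (host, port, and database name here are placeholders):

jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true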

Please note that Hibernate transparently disables insert batching at the JDBC level if the primary key of the table you are inserting into uses GenerationType.IDENTITY, because Hibernate has to execute each INSERT immediately to obtain the generated id.
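
If your target database supports sequences, a sequence-based generator keeps JDBC batching available. A minimal sketch, assuming a hypothetical entity TableA and a database sequence named tablea_seq:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

@Entity
public class TableA {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "tablea_seq")
    // allocationSize matched to the JDBC batch size so ids are fetched in blocks
    @SequenceGenerator(name = "tablea_seq", sequenceName = "tablea_seq", allocationSize = 50)
    private Long id;

    // ... other fields, getters and setters
}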

From your log: you save() only one record and then flush(), so there is only one pending INSERT to process for every flush. That is why Hibernate cannot batch your inserts: there is only a single INSERT statement per flush. You should save() up to a batch's worth of records before calling flush(), instead of calling flush() after every save().

The best practice for batch inserting is something like this:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 888888; i++) {
    TableA record = new TableA();
    record.setXXXX();
    session.save(record);
    if ((i + 1) % 50 == 0) { // 50, same as the JDBC batch size
        // flush a batch of inserts and release memory
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();

You save and flush the records batch by batch. At the end of each batch you should clear the persistence context to release memory and prevent memory exhaustion, since every persistent object is kept in the first-level cache (your JVM's memory). You could also disable the second-level cache to avoid unnecessary overhead.
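
If nothing else in the application relies on it, the second-level cache can be switched off with a single property (in recent Hibernate versions it is already off unless a cache provider is configured):

hibernate.cache.use_second_level_cache=false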

