Caffe: What can I do if only a small batch fits into memory?

You can set the `iter_size` parameter in the solver configuration.
Caffe accumulates gradients over `iter_size` x `batch_size` instances before performing each stochastic gradient descent update, so the effective batch size becomes `iter_size * batch_size`.
Increasing `iter_size` therefore yields a more stable gradient estimate even when limited GPU memory prevents you from using a large `batch_size` directly.
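As a sketch, here is what this could look like in a solver prototxt. The filenames and numeric values are illustrative, not taken from the original post: with `batch_size: 32` in the net definition and `iter_size: 4` in the solver, the effective batch size is 128.

```
# solver.prototxt (illustrative values)
net: "train_val.prototxt"   # net whose data layer uses batch_size: 32
base_lr: 0.01
momentum: 0.9
# Accumulate gradients over 4 forward/backward passes before updating,
# giving an effective batch size of 4 * 32 = 128.
iter_size: 4
```

Note that the loss and learning rate behave as if you had trained with the larger effective batch, since Caffe normalizes the accumulated gradient by `iter_size`.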
