[caffe]: check fails: Check failed: hdf_blobs_[i]->shape(0) == num (200 vs. 6000)

The problem

It seems there is indeed a conflict in the memory order of the arrays: MATLAB stores elements from the first dimension to the last (column-major, like Fortran), while Caffe and HDF5 store arrays from the last dimension to the first (row-major, like C):
Suppose we have X of shape n×c×h×w; then the “second element of X” in memory is X(2,1,1,1) in MATLAB but X[0][0][0][1] in C (1-based vs 0-based indexing doesn’t make life easier at all).
Therefore, when you save an array of size [200, 6000, 1, 1] in MATLAB, what HDF5 and Caffe actually see is an array of shape [6000, 200].
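A quick way to convince yourself of MATLAB’s column-major ordering is to linearly index a small toy array (a minimal sketch; X here is just an illustration, not part of the original data):

X = reshape(1:24, [2, 3, 4]);  % toy 2x3x4 array; values 1..24 fill the first dimension fastest
X(2)        % linear index 2 gives 2, i.e. X(2,1,1): the second element in memory steps along dim 1
X(1,2,1)    % gives 3: only the third element in memory, reached after dim 1 is exhausted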

Using the h5ls command-line tool can help you spot the problem.
In MATLAB you saved:

>> hdf5write('my_data.h5', '/new_train_x', ...
     single( reshape(new_train_x, [200, 6000, 1, 1]) ) );
>> hdf5write('my_data.h5', '/label_train', ...
     single( reshape(label_train, [200, 1, 1, 1]) ), ...
     'WriteMode', 'append' );

Now you can inspect the resulting my_data.h5 using h5ls (in Linux terminal):

user@host:~/$ h5ls ./my_data.h5
  label_train              Dataset {200}
  new_train_x              Dataset {6000, 200}

As you can see, the arrays are written “backwards”.
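The reason neither dataset shows four dimensions at all is that MATLAB drops trailing singleton dimensions, so hdf5write never sees the two trailing 1s; HDF5 then records the remaining dimensions in reverse order. You can check this from MATLAB (a minimal sketch, assuming new_train_x and label_train are the arrays loaded from data.mat):

% trailing singleton dimensions are dropped before hdf5write ever sees the data
size( reshape(new_train_x, [200, 6000, 1, 1]) )   % returns [200 6000], not [200 6000 1 1]
size( reshape(label_train, [200, 1, 1, 1]) )      % returns [200 1]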

Solution

To take this conflict into account when exporting data from MATLAB, you should permute the dimensions before writing:

load data.mat
hdf5write('my_data.h5', '/new_train_x', ...
  single( permute(reshape(new_train_x, [200, 6000, 1, 1]), [4:-1:1]) ) );
hdf5write('my_data.h5', '/label_train', ...
  single( permute(reshape(label_train, [200, 1, 1, 1]), [4:-1:1]) ), ...
  'WriteMode', 'append' );

Inspecting the resulting my_data.h5 with h5ls now gives:

user@host:~/$ h5ls ./my_data.h5
  label_train              Dataset {200, 1, 1, 1}
  new_train_x              Dataset {200, 6000, 1, 1}

This is what you expected in the first place.
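As a MATLAB-side sanity check (a minimal sketch, using the same new_train_x and label_train as above), you can look at the size of the permuted arrays before writing; HDF5 records these dimensions in reverse order:

size( permute(reshape(new_train_x, [200, 6000, 1, 1]), [4:-1:1]) )   % [1 1 6000 200] -> written as {200, 6000, 1, 1}
size( permute(reshape(label_train, [200, 1, 1, 1]), [4:-1:1]) )      % [1 1 1 200]    -> written as {200, 1, 1, 1}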
