How to read a v7.3 mat file via h5py?

Matlab 7.3 file format is not extremely easy to work with h5py. It relies on HDF5 reference, cf. h5py documentation on references.

>>> import h5py
>>> f = h5py.File('test.mat')
>>> list(f.keys())
['#refs#', 'struArray']
>>> struArray = f['struArray']
>>> struArray['name'][0, 0]  # this is the HDF5 reference
<HDF5 object reference>
>>> f[struArray['name'][0, 0]].value  # this is the actual data
array([[111],
       [110],
       [101]], dtype=uint16)

To read struArray(i).id:

>>> f[struArray['id'][0, 0]][0, 0]
1.0
>>> f[struArray['id'][1, 0]][0, 0]
2.0
>>> f[struArray['id'][2, 0]][0, 0]
3.0

Notice that Matlab stores a number as an array of size (1, 1), hence the final [0, 0] to get the number.

To read struArray(i).data:

>>> f[struArray['data'][0, 0]].value
array([[  1.],
       [  2.],
       [  3.],
       [  4.],
       [  5.],
       [  6.],
       [  7.],
       [  8.],
       [  9.],
       [ 10.]])

To read struArray(i).name, it is necessary to convert the array of integers to string:

>>> f[struArray['name'][0, 0]].value.tobytes()[::2].decode()
'one'
>>> f[struArray['name'][1, 0]].value.tobytes()[::2].decode()
'two'
>>> f[struArray['name'][2, 0]].value.tobytes()[::2].decode()
'three'

Leave a Comment