Read a MATLAB v7.3 file into a Python list of NumPy arrays via h5py

Well, I found a solution to my problem. If anyone has a better solution or can explain it better, I’d still like to hear it.

Basically, each <HDF5 object reference> has to be used to index the h5py file object, which returns the dataset that the reference points to. That dataset then has to be loaded into memory by indexing it with [:], or with a smaller slice if only part of the array is required. Here is what I mean:

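import h5py
# 'rank' holds object references: f[ref] follows one, [:] reads the data into memory
with h5py.File("f.mat", "r") as f:
    data = [f[element[0]][:] for element in f['rank']]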

and the result:

In [79]: data[0].shape
Out[79]: (50L, 53L)

In [80]: data[0].dtype
Out[80]: dtype('float64')
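
For anyone who finds the one-liner hard to read, here is a rough sketch of the same steps written out one at a time. The function names load_referenced_arrays and load_from_mat and the rows parameter are my own, made up for illustration, and are not part of h5py. I am also assuming every row of the dataset holds its reference in position 0, as it does in my file.

```python
def load_referenced_arrays(f, name, rows=slice(None)):
    """Follow each object reference stored in f[name] and load its target.

    f    : an open h5py.File (or anything that can be indexed the same way)
    name : name of the dataset that holds the references, e.g. 'rank'
    rows : slice applied to each target; the default reads all of it
    """
    arrays = []
    for element in f[name]:
        ref = element[0]             # the object reference stored in this row
        target = f[ref]              # indexing the file with it gives the dataset
        arrays.append(target[rows])  # slicing is what reads the data into memory
    return arrays


def load_from_mat(path, name):
    # h5py is imported here so the helper above has no hard dependency on it
    import h5py
    with h5py.File(path, "r") as f:
        return load_referenced_arrays(f, name)
```

With this, load_from_mat("f.mat", "rank") should give the same list as the comprehension above, and passing something like rows=slice(0, 10) to load_referenced_arrays reads only the first ten rows along the first axis of each array.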

Hope this helps anyone in the future. I think this is the most general solution I’ve seen so far.
