What is the difference between the two ways of accessing the HDF5 group in the SVHN dataset?

First, there is a minor difference in output from your 2 methods.
Method 1: returns the full array (of the encoded file name)
Method 2: only returns the first element (character) of the array

Let’s deconstruct your code to understand what you have.
The first part deals with h5py data objects.

f['digitStruct'] -> returns a h5py group object
f['digitStruct']['name'] -> returns a h5py dataset object
f['digitStruct']['name'].name -> returns the name (path) of the dataset object

Note:
The /digitStruct/name dataset contains “Object References”. Each array entry is a pointer to another h5py object (in this case another dataset).
For example (spaces added to set off the inner object reference):
f[ f['digitStruct']['name'][0][0] ] -> returns the object referenced at [0][0]
So the outer f[ obj_ref ] dereferences the reference and returns the object it points to, just as f['name'] returns the object at that path.

In the case of f['digitStruct']['name'][0][0], this is an object pointing to dataset /#refs#/b
In other words, f['digitStruct']['name'][0][0] references the same object as:
f['#refs#']['b'] or f['/#refs#/b']
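
To make the dereferencing step concrete, here is a minimal sketch that builds a tiny in-memory HDF5 file shaped like the SVHN layout and follows one object reference. The file name refs_demo.h5, the function name demo_dereference, and the uint16 character codes are assumptions made for illustration, and h5py.ref_dtype assumes h5py 2.10 or later.

```python
import h5py
import numpy as np


def demo_dereference():
    """Build a small stand-in for the SVHN layout and follow one reference."""
    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("refs_demo.h5", "w", driver="core", backing_store=False) as f:
        # Target dataset: character codes for "1.png" stored as a column
        # (the uint16 dtype is an assumption for this sketch).
        codes = np.array([[ord(c)] for c in "1.png"], dtype=np.uint16)
        target = f.create_dataset("#refs#/b", data=codes)

        # Dataset of object references with shape (n, 1), like /digitStruct/name.
        names = f.create_dataset("digitStruct/name", shape=(1, 1), dtype=h5py.ref_dtype)
        names[0, 0] = target.ref

        ref = names[0][0]   # the object reference itself
        obj = f[ref]        # dereference: the dataset the reference points to
        return obj.name, obj[:].tolist()


path, data = demo_dereference()
print(path)
print(data)
```

The returned path shows that f[ref] and f['/#refs#/b'] lead to the same dataset.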

So much for h5py object references.
Let’s continue to get the data from this object reference using Method 1.

f[f['digitStruct']['name'][0][0]].value -> returns the entire /#refs#/b dataset as a NumPy array.

However, dataset.value is deprecated (and was removed in h5py 3.0), so NumPy indexing is preferred, like this:
f[f['digitStruct']['name'][0][0]][:] (to get the entire array)

Note: both of these return the entire array of encoded characters.
At this point, getting the name is plain Python and NumPy functionality.
Use this to return the filename as a string (.tobytes() replaces the deprecated .tostring()):
f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii')
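
As an alternative sketch of the decoding step, the character codes can be joined one at a time, which does not depend on how many bytes each code occupies. The helper name codes_to_str and the uint16 dtype are assumptions for illustration. If the codes are stored as 16-bit integers, a byte-level decode may leave NUL characters between the letters, and joining the codes avoids that.

```python
import numpy as np

# Character codes as printed in the example output; the uint16 dtype is an assumption.
codes = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)


def codes_to_str(arr):
    """Join an array of character codes into a Python string."""
    return "".join(chr(int(c)) for c in np.asarray(arr).flatten())


filename = codes_to_str(codes)
print(filename)
```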

Now let’s deconstruct the object reference you used for Method 2.

f['digitStruct']['name'].value
-> returns the entire /digitStruct/name dataset as a NumPy array.
It has 13,068 rows, each holding one object reference.

f['digitStruct']['name'].value[0] -> is the first row (a 1-element array)

f['digitStruct']['name'].value[0].item() -> returns that single array element as a plain Python object (here, the object reference)

So all of these point to the same object:
Method 1: f['digitStruct']['name'][0][0]
Method 2: f['digitStruct']['name'].value[0].item()
And are both the same as f['#refs#']['b'] or f['/#refs#/b'] for this example.
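
The equivalence of [0][0] and [0].item() is ordinary NumPy behavior and can be checked without an HDF5 file. This is a small sketch using a hypothetical FakeRef class as a stand-in for h5py object references. The paths other than /#refs#/b are made up for the example.

```python
import numpy as np


class FakeRef:
    """Hypothetical stand-in for an h5py object reference."""

    def __init__(self, target):
        self.target = target


# Object array with shape (n, 1), like the array read from /digitStruct/name.
refs = np.empty((3, 1), dtype=object)
for i, path in enumerate(["/#refs#/b", "/#refs#/c", "/#refs#/d"]):
    refs[i, 0] = FakeRef(path)

via_index = refs[0][0]     # Method 1 style: index the row, then the element
via_item = refs[0].item()  # Method 2 style: pull the single element out of the row

same_object = via_index is via_item
print(same_object, via_item.target)
```

Note that .item() only works here because each row holds exactly one element.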

Like Method 1, getting the string is plain Python and NumPy functionality (.tobytes() replaces the deprecated .tostring()).

f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii')

Yes, object references are complicated.
My recommendation:
Extract NumPy arrays from objects using NumPy indexing instead of .value (as in the [:] form of Method 1 above).
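
A short sketch of that recommendation, using a throwaway in-memory file so it runs without the SVHN data. The file name value_demo.h5, the dataset name b, and the function name read_without_value are assumptions for illustration.

```python
import h5py
import numpy as np


def read_without_value():
    """Read a dataset with NumPy-style indexing rather than .value."""
    # driver="core" with backing_store=False keeps the file in memory only.
    with h5py.File("value_demo.h5", "w", driver="core", backing_store=False) as f:
        data = np.array([[49], [46], [112], [110], [103]], dtype=np.uint16)
        ds = f.create_dataset("b", data=data)

        whole_slice = ds[:]          # entire dataset as a NumPy array
        whole_empty_tuple = ds[()]   # also the entire dataset
        first_code = int(ds[0, 0])   # a single element
    return whole_slice, whole_empty_tuple, first_code


whole_slice, whole_empty_tuple, first_code = read_without_value()
print(whole_slice.shape, first_code)
```

Both ds[:] and ds[()] read the full array, so either can stand in for .value.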

Example code for completeness. Intermediate print statements used to show what’s going on.

import h5py

# Both of these methods read the name of the 1st
# image in the SVHN dataset.
# Note: .value only works on h5py versions before 3.0;
# on newer versions use [()] or [:] in its place.
f = h5py.File('test_digitStruct.mat', 'r')
print(f['digitStruct'])               # group object
print(f['digitStruct']['name'])       # dataset object (holds object references)
print(f['digitStruct']['name'].name)  # path of the dataset

# method 1
print('\ntest method 1')
print(f[f['digitStruct']['name'][0][0]])
print(f[f['digitStruct']['name'][0][0]].name)
# both of these get the entire array / filename:
print(f[f['digitStruct']['name'][0][0]].value)
print(f[f['digitStruct']['name'][0][0]][:])  # same as .value above
print(f[f['digitStruct']['name'][0][0]][:].tobytes().decode('ascii'))

# method 2
print('\ntest method 2')
print(f[f['digitStruct']['name'].value[0].item()])
print(f[f['digitStruct']['name'].value[0].item()].name)

# this only gets the first array member / character:
print(f[f['digitStruct']['name'].value[0].item()].value[0][0])
print(f[f['digitStruct']['name'].value[0].item()].value[0][0].tobytes().decode('ascii'))
# this gets the entire array / filename:
print(f[f['digitStruct']['name'].value[0].item()][:])
print(f[f['digitStruct']['name'].value[0].item()][:].tobytes().decode('ascii'))

f.close()

Output from last 2 print statements for each method is identical:

[[ 49]
 [ 46]
 [112]
 [110]
 [103]]
1.png
