Is there a single function in Python that shows the full structure of an .hdf5 file?

Question:

When opening an .hdf5 file, one can explore its levels, keys and names in different ways. I wonder whether there is a function that displays all the available paths in the file, ultimately showing the whole tree.

Asked By: nicdelillo


Answers:

Try using the nexusformat package to list the structure of the HDF5 file.

Install it with: pip install nexusformat

Code:

import nexusformat.nexus as nx
f = nx.nxload('myhdf5file.hdf5')
print(f.tree)

This should print the entire structure of the file. For more on that, see this thread. Examples can be found here.

Answered By: AzyCrw4282

You can also get the file schema/contents without writing any Python code or installing additional packages. If you just want to see the entire schema, take a look at the h5dump utility from The HDF Group. It has options to control how much detail is dumped. Note: by default it dumps everything. To get a quick/small dump, use: h5dump -n 1 --contents=1 h5filename.h5

Another Python package is PyTables. It ships a command-line utility, ptdump, for interrogating an HDF5 file (similar to h5dump above).

Finally, here are some tips if you want to programmatically access groups and datasets recursively in Python. h5py and tables (PyTables) each have methods to do this:

In h5py:
Use the object.visititems(callable) method. It calls the callable function for each object in the tree.
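
As a rough illustration of the visititems approach, here is a minimal sketch. The helper name list_paths and the demo file contents are made up for this example; the callback receives each object's path (relative to the starting object) and the object itself, and returning None keeps the traversal going.

```python
import h5py


def list_paths(h5obj):
    """Collect (path, description) pairs for every group and dataset below h5obj."""
    entries = []

    def visitor(name, obj):
        # name is the path relative to h5obj; obj is a Group or a Dataset
        if isinstance(obj, h5py.Dataset):
            entries.append((name, f"dataset, shape={obj.shape}, dtype={obj.dtype}"))
        else:
            entries.append((name, "group"))
        # implicitly returning None tells h5py to continue with the next object

    h5obj.visititems(visitor)
    return entries


if __name__ == "__main__":
    # In-memory demo file (nothing is written to disk); contents are hypothetical.
    with h5py.File("demo.h5", "w", driver="core", backing_store=False) as hf:
        hf.create_dataset("raw/signal", data=[1, 2, 3])
        hf.create_group("meta")
        for path, desc in list_paths(hf):
            print(path, "->", desc)
```

To use it on a real file, open the file read-only with h5py.File(filename, 'r') and pass the file object (or any group inside it) to list_paths.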

In PyTables:
PyTables has multiple ways to access groups, datasets and nodes. object.walk_nodes returns an iterator that descends recursively through the tree. object.list_nodes returns a list, and object.iter_nodes returns an iterator, but both cover only the immediate children of a group and are not recursive.
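
For comparison, a small sketch of a recursive listing with walk_nodes. The helper name walk_paths and the demo file contents are assumptions made for this example; _v_pathname is the node attribute that holds its full path in the file.

```python
import os
import tempfile

import tables


def walk_paths(h5file, where="/"):
    """Return (path, class name) pairs for `where` and every node below it."""
    # walk_nodes descends recursively, starting with the `where` node itself
    return [(node._v_pathname, type(node).__name__)
            for node in h5file.walk_nodes(where)]


if __name__ == "__main__":
    # Throwaway file with made-up contents, just to have something to walk.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.h5")
        with tables.open_file(path, mode="w") as h5:
            h5.create_group("/", "meta")
            h5.create_array("/meta", "values", [1, 2, 3])
            for node_path, kind in walk_paths(h5):
                print(node_path, kind)
```

On an existing file, open it with tables.open_file(filename, mode='r') and pass the handle to walk_paths; a different starting group can be given through the where argument.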

Answered By: kcw78

For everyone who wants to stay with the h5py package:

This is not a one-liner from an implementation perspective, but it needs nothing beyond h5py. Once the recursive function below is defined, calling it is a one-liner:

import h5py

filename_hdf = 'data.hdf5'

def h5_tree(group, pre=''):
    """Print the groups and datasets below `group` as an indented tree."""
    remaining = len(group)
    for key, item in group.items():
        remaining -= 1
        last = remaining == 0
        # the last entry of a group gets a closing branch
        branch = '└── ' if last else '├── '
        if isinstance(item, h5py.Group):
            print(pre + branch + key)
            # descend, keeping the vertical bar unless this was the last entry
            h5_tree(item, pre + ('    ' if last else '│   '))
        else:
            # print the dataset shape; len() would fail on scalar datasets
            print(pre + branch + key + ' ' + str(item.shape))

with h5py.File(filename_hdf, 'r') as hf:
    print(hf)
    h5_tree(hf)

Answered By: Alex44