Is there a single function in Python that shows the full structure of an .hdf5 file?
Question:
When opening a .hdf5 file, one can explore the levels, keys and names of the file in different ways. I wonder if there is a way or a function that displays all the available paths to explore in the .hdf5 file, ultimately showing the whole tree.
Answers:
You can also get the file schema/contents without writing any Python code or installing additional packages. If you just want to see the entire schema, take a look at the h5dump utility from The HDF Group. There are options to control the amount of detail that is dumped. Note that by default everything is dumped. To get a quick, small dump, use:
h5dump -n 1 --contents=1 h5filename.h5
Another Python package is PyTables. It has a utility, ptdump, which is a command-line tool to interrogate an HDF file (similar to h5dump above).
Finally, here are some tips if you want to programmatically access groups and datasets recursively in Python. h5py and tables (PyTables) each have methods to do this:
In h5py:
Use the object.visititems(callable) method. It calls callable(name, obj) for each group and dataset below object, where name is the path relative to object.
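As a minimal sketch of the visititems approach, the helper below collects every path under a group together with a short description. The names list_paths and visitor, and the in-memory demo file, are made up for illustration and are not part of h5py.

```python
import h5py
import numpy as np


def list_paths(h5_group):
    """Collect (path, description) pairs for every object below h5_group."""
    found = []

    def visitor(name, obj):
        # visititems passes the path relative to h5_group and the object itself.
        if isinstance(obj, h5py.Dataset):
            found.append((name, f"dataset shape={obj.shape} dtype={obj.dtype}"))
        else:
            found.append((name, "group"))
        # Returning None (implicitly) tells h5py to keep walking.

    h5_group.visititems(visitor)
    return found


# Hypothetical in-memory file, used only so the sketch runs without data on disk.
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as hf:
    hf.create_dataset("raw/images", data=np.zeros((4, 3)))
    hf.create_dataset("labels", data=np.arange(4))
    for path, description in list_paths(hf):
        print(f"{path}: {description}")
```

With a real file, open it in 'r' mode and pass the file object (or any group inside it) to list_paths.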
In PyTables:
PyTables has multiple ways to access groups, datasets and nodes. object.walk_nodes returns an iterable that descends recursively through the tree. object.list_nodes returns a list and object.iter_nodes returns an iterable; both of these cover only the direct children of a group and are not recursive.
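A rough PyTables counterpart might look like the sketch below, assuming the tables package is installed. The helper name pytables_paths and the temporary demo file are invented for illustration.

```python
import os
import tempfile

import numpy as np
import tables


def pytables_paths(h5file, where="/"):
    """Return the full path of every node reachable from `where`."""
    # walk_nodes descends recursively; _v_pathname is the node's absolute path.
    return [node._v_pathname for node in h5file.walk_nodes(where)]


# Hypothetical demo file in a temporary directory, so nothing is left behind.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.h5")
    with tables.open_file(demo_path, mode="w") as h5file:
        group = h5file.create_group("/", "raw")
        h5file.create_array(group, "images", np.zeros((4, 3)))
        h5file.create_array("/", "labels", np.arange(4))
        for path in pytables_paths(h5file):
            print(path)
```

Swapping walk_nodes for list_nodes or iter_nodes in the helper would restrict the result to the direct children of `where`.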
For everyone who wants to stay with the h5py package:
This is not a one-liner from an implementation perspective, but it needs nothing beyond the h5py package. Once the recursive function is defined, you can call it as a one-liner:
import h5py

filename_hdf = 'data.hdf5'

def h5_tree(group, pre=''):
    """Print the contents of an h5py file or group as a tree."""
    items = len(group)
    for key, val in group.items():
        items -= 1
        if items == 0:
            # the last item in this group
            if isinstance(val, h5py.Group):
                print(pre + '└── ' + key)
                h5_tree(val, pre + '    ')
            else:
                # len() is the size of the first axis; scalar datasets raise TypeError here
                print(pre + '└── ' + key + ' (%d)' % len(val))
        else:
            if isinstance(val, h5py.Group):
                print(pre + '├── ' + key)
                h5_tree(val, pre + '│   ')
            else:
                print(pre + '├── ' + key + ' (%d)' % len(val))

with h5py.File(filename_hdf, 'r') as hf:
    print(hf)
    h5_tree(hf)