Calling arrays to be calculated by function

Question:

I have two functions: the "main" one, which runs the scripts and creates an H5 file to store data, and a math one, which produces the data by doing calculations on the two arrays I give it. How do I pass my arrays to the math function properly and have it write the calculated data to my H5 file? I’m having trouble calling the aforementioned function in my math one.

How I create the H5 file:

with h5py.File('calculations.h5', 'w') as hf:
        if not hf.__contains__(pre):
            hf.create_group(pre)
        if not hf[pre].__contains__('array_one'):
            hf[pre].create_group('array_one')
        if not hf[pre].__contains__('array_two'):
            hf[pre].create_group('array_two')
        for dim in aver.__dict__.keys():
            if not dim == 't':
                for key in aver.__getattribute__(pre).__dict__.keys():
                    if not hf[pre]['array_one'].__contains__(dim):
                        hf[pre]['array_one'].create_group(dim)
                    if not hf[pre]['array_one'][dim].__contains__(key[:-2]):
                        hf[pre]['array_one'][dim].create_dataset(key[:-2].lower(),
                        data=aver.__getattribute__(dim).__getattribute__(key))
                    if not hf[pre]['array_two'].__contains__(dim):
                        hf[pre]['array_two'].create_group(dim)
                    if not hf[pre]['array_two'][dim].__contains__(key[:-2]):
                        hf[pre]['array_two'][dim].create_dataset(key[:-2].lower(),
                        data=calc.__getattribute__(key[:-2].lower()))
                    arrone = hf[pre]['array_one'][dim]
                    arrtwo = hf[pre]['array_two'][dim]
        if relerr:
            if not hf[pre].__contains__('relerrors'):
                hf[pre].create_group('relerrors')
            for dim in hf[pre]['array_one'].keys():
                if not hf[pre]['relerrors'].__contains__(dim):
                    hf[pre]['relerrors'].create_group(dim)
                for key in hf[pre]['array_one'][dim].keys():
                    reler = relerror(arrone,arrtwo)
                    hf[pre]['relerrors'][dim].create_dataset(key+"_relerror",data=reler)

My math function:

import numpy as np
from numpy.linalg import norm
from sklearn.metrics import mean_absolute_error as mae

def relerror(arrone,arrtwo,relerr=True):
relone=arrone.copy()
reltwo=arrtwo.copy()
atmp=np.ma.array(arrtwo)
atmp[atmp==0]=np.ma.masked
if relerr:
    relone[atmp.mask==True] = arrone[atmp.mask==True]
    relone[atmp.mask==False] = arrone[atmp.mask==False]/np.abs(arrtwo[atmp.mask==False])
    reltwo[atmp.mask==False] = arrtwo[atmp.mask==False]/np.abs(arrtwo[atmp.mask==False])
return mae(relone, reltwo)

EDIT: Adding [()] as @kcw78 proposed, together with the dataset name, i.e. [key[:-2].lower()][()], now reads the arrays properly.

Asked By: eulinean


Answers:

I was going to comment this but it was a bit too long.

Is this code indented correctly? If not, fix it. Then, after you define your function, you have to call it:

import numpy as np
from numpy.linalg import norm
from sklearn.metrics import mean_absolute_error as mae

################################################################################# 
#Your h5py code here, it's too long and I don't want to include it in the answer#
################################################################################# 

def relerror(arrone,arrtwo,relerr):
    relone=arrone.copy()
    reltwo=arrtwo.copy()
    atmp=np.ma.array(arrtwo)
    atmp[atmp==0]=np.ma.masked
    if relerr:
        relone[atmp.mask==True] = arrone[atmp.mask==True]
        relone[atmp.mask==False] = arrone[atmp.mask==False]/np.abs(arrtwo[atmp.mask==False])
        reltwo[atmp.mask==False] = arrtwo[atmp.mask==False]/np.abs(arrtwo[atmp.mask==False])
    return mae(relone, reltwo)

maeResult = relerror(arrone, arrtwo, True)
print(maeResult)
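
For comparison, the same quantity can be written without the masked array. This is only a sketch, not the asker's code: it assumes 1-D numeric inputs, swaps sklearn's mean_absolute_error for a plain NumPy mean, and the name relerror_sketch is made up for illustration.

```python
import numpy as np

def relerror_sketch(arrone, arrtwo):
    """Mean relative error of arrone against arrtwo.

    Where arrtwo is zero the absolute difference is used instead,
    which mirrors what the masked version above does.
    """
    arrone = np.asarray(arrone, dtype=float)
    arrtwo = np.asarray(arrtwo, dtype=float)
    nonzero = arrtwo != 0
    diff = np.abs(arrone - arrtwo)
    # Divide by |arrtwo| only where it is nonzero; divide by 1 elsewhere
    scale = np.where(nonzero, np.abs(arrtwo), 1.0)
    return float(np.mean(diff / scale))

result = relerror_sketch([1.0, 2.0, 3.0], [2.0, 0.0, 3.0])
print(result)
```

Converting to float first also avoids the truncation that can happen when the inputs are integer arrays and the division result is assigned back into a copy.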
Answered By: Michael S.

The answer posted by @Michael S. should address your question about calling a function and returning the values.

Your newest question has to do with the way you are reading the HDF5 dataset. The way you access the dataset returns an h5py dataset object: arrone = hf[pre]['array_one'][dim]. Dataset objects "behave like" numpy arrays in many ways. However, in your case you need an actual array. To get one, add [()], like this: arrone = hf[pre]['array_one'][dim][()] or
arrone = hf[f'{pre}/array_one/{dim}'][()] (using an f-string to define the path).
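
A small sketch of the difference between a group, a dataset object and an array may help here. The file name, the path and the dataset name 'value' are placeholders. It uses an in-memory file so nothing is written to disk. Note that in the question's layout the dim level looks like a group that holds the datasets, so the dataset name is needed before [()], as in the asker's EDIT.

```python
import h5py
import numpy as np

# In-memory file: driver='core' with backing_store=False never touches disk
with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as hf:
    # Intermediate groups in the path are created automatically
    hf.create_dataset('pre/array_one/x/value', data=np.arange(4.0))

    group = hf['pre/array_one/x']   # an h5py Group, holds no array data itself
    dset = group['value']           # an h5py Dataset, data still in the file
    arr = dset[()]                  # a numpy.ndarray holding all the data

    kinds = (type(group).__name__, type(dset).__name__, type(arr).__name__)

print(kinds)
print(arr)
```

Because arr is a regular NumPy array, it stays usable after the file is closed, while group and dset do not.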

The rest of this answer addresses h5py usage to create the HDF5 file. You have a lot of code to check group names and create the ones that don’t exist. You don’t need to do that — use require_group() instead. It works like this:

hf.require_group(f'{pre}/array_one') and the same for array_two and dim groups.

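There is also a require_dataset() function to do the same for datasets. Unlike require_group(), it needs the shape and dtype as arguments so it can check them against an existing dataset.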
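
As a rough illustration of both calls, assuming a throwaway in-memory file and placeholder names ('pre', 'x'):

```python
import h5py
import numpy as np

with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as hf:
    # First call creates the group (and its parent), second returns the same one
    g1 = hf.require_group('pre/array_one')
    g2 = hf.require_group('pre/array_one')
    names = (g1.name, g2.name)
    n_children = len(hf['pre'])

    data = np.ones((2, 3))
    # Creates the dataset because it does not exist yet
    d1 = hf.require_dataset('pre/array_one/x', shape=data.shape, dtype=data.dtype)
    d1[...] = data
    # Returns the existing dataset; shape and dtype must be compatible
    d2 = hf.require_dataset('pre/array_one/x', shape=(2, 3), dtype='f8')
    stored = d2[()]

print(names, n_children)
print(stored)
```

If the shape passed to require_dataset() does not match the existing dataset, h5py raises a TypeError instead of silently overwriting it.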

Next, you don’t need to call the dunder __contains__ method. Instead, use if 'array_one' not in hf:.
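
A minimal sketch of the membership test, again with a placeholder in-memory file and group names:

```python
import h5py

with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as hf:
    hf.create_group('pre/array_one')

    has_one = 'array_one' in hf['pre']    # name exists in the 'pre' group
    has_two = 'array_two' in hf['pre']    # name does not exist
    has_path = 'pre/array_one' in hf      # a relative path works as well

print(has_one, has_two, has_path)
```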

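Finally, it’s not clear what you are trying to do with __getattribute__(). If you are trying to get a group or dataset attribute, there is an easier way with the .attrs property. It uses the same dictionary syntax. If you are reading attributes of an ordinary Python object, the built-in getattr() does the same job and is easier to read.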
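
The two cases can be sketched side by side. The aver object below is a hypothetical stand-in built with SimpleNamespace, since the real one is not shown in the question. The attribute name 'units' is also just an example.

```python
import h5py
from types import SimpleNamespace

# Hypothetical stand-in for the question's `aver` object
aver = SimpleNamespace(x=SimpleNamespace(value_a=[1.0, 2.0]))

# Plain Python attributes: getattr() and vars() replace the dunder calls
values = getattr(getattr(aver, 'x'), 'value_a')
names = list(vars(aver.x).keys())

with h5py.File('demo.h5', 'w', driver='core', backing_store=False) as hf:
    dset = hf.create_dataset('pre/array_one/x/value', data=values)
    # HDF5 attributes: .attrs behaves like a dictionary
    dset.attrs['units'] = 'm'
    units = dset.attrs['units']

print(values, names, units)
```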

This is my attempt to pull all of this together:
Improved method to create the H5 file:

with h5py.File('calculations.h5', 'w') as hf:
        hf.require_group(f'{pre}/array_one')
        hf.require_group(f'{pre}/array_two')
        for dim in aver.keys():  # aver undefined: assumes this is a group object
            if not dim == 't':
                for key in aver.__getattribute__(pre).__dict__.keys():
                    name = key[:-2].lower()  # dataset name used in both groups
                    hf[pre]['array_one'].require_group(dim)
                    if name not in hf[pre]['array_one'][dim]:
                        hf[pre]['array_one'][dim].create_dataset(name,
                        data=aver.__getattribute__(dim).__getattribute__(key))
                    hf[pre]['array_two'].require_group(dim)
                    if name not in hf[pre]['array_two'][dim]:
                        hf[pre]['array_two'][dim].create_dataset(name,
                        data=calc.__getattribute__(name))
        if relerr:
            hf[pre].require_group('relerrors')
            for dim in hf[pre]['array_one'].keys():
                hf[pre]['relerrors'].require_group(dim)
                for key in hf[pre]['array_one'][dim].keys():
                    # read each pair of datasets into numpy arrays with [()]
                    arrone = hf[pre]['array_one'][dim][key][()]
                    arrtwo = hf[pre]['array_two'][dim][key][()]
                    reler = relerror(arrone,arrtwo)
                    hf[pre]['relerrors'][dim].create_dataset(key+"_relerror",data=reler)
Answered By: kcw78