How to add variable (with lower dimensionality) to xarray datasets

Question:

I couldn’t find an answer to my particular problem in the xarray documentation. A solution might be helpful to others as well.

So here is my issue: I’m trying to use xarray in a fluid way, as I used to do with nested dictionaries, structuring the data according to my experimental protocol, that is, mirroring the labeling of the raw experimental data. I also want to add new variables (with lower dimensionality) on the fly, derived from the raw data, under the same labeling scheme.

For this example, let’s say I have a sample of measurements represented here:
[Image: data representation in Excel sheet]

The sample is one big matrix made of 9 sub-matrices of 36 points each, named as follows: row 1: a1, b1, c1; row 2: a2, b2, c2; row 3: a3, b3, c3. Each point within a sub-matrix is indexed by row 1 to 6 and col 1 to 6.

I have multiple big matrices labeled A, B, C, etc. that have the same structure.

The xarray dataset should index the measurement variable ‘sag0’ by the coordinates [x, y, row, col].
Now I want to add a variable that represents, say, the average value of all elements of the 6×6 array located at [x, y].

I proceeded as follows:

  1. load the Excel sheet into pandas
  2. create a set of inner keys and outer keys
  3. create an empty NumPy array whose dimensions correspond to the xarray dimensions
  4. load the sag0 values into the dataset
  5. add a new variable (here’s my problem)

import numpy as np
import xarray as xr

# Coordinate labels: big matrices (x), sub-matrices (y), and the 36 lenses per sub-matrix
primArray_coord_label = ['A', 'B', 'C']
secdArray_coord_label = [letter + number for letter in ['a', 'b', 'c'] for number in ['1', '2', '3']]
lens_number_coord_label = list(np.arange(1, 37))
lens_number_coord_array_label = np.reshape(lens_number_coord_label, (6, 6))
na = len(primArray_coord_label)
nb = len(secdArray_coord_label)
nc = len(lens_number_coord_label)

# Create the multi-dimensional array mirroring the nested dictionary;
# it will contain the values of the variable.
# nested_dict[outerkey][innerkey] holds one 6x6 block (built from the Excel sheet, not shown).
multidimarray = np.empty((na, nb, 6, 6))
for k, outerkey in enumerate(primArray_coord_label):
    for l, innerkey in enumerate(secdArray_coord_label):
        values = nested_dict[outerkey][innerkey]
        # Store the 6x6 block at position (k, l)
        multidimarray[k, l, :, :] = values

# Convert the multi-dimensional array into an xarray Dataset using the dictionary keys as coordinates.
ds = xr.Dataset(
    {
        'sag0': (
            ['x', 'y', 'row', 'col'],
            multidimarray,
            {'description': 'lens sag value after etching', 'units': r'$\mu m$'},
        ),
    },
    coords={
        'x': ('x', primArray_coord_label, {'description': 'primary zone of 9 matrices on sample'}),
        'y': ('y', secdArray_coord_label, {'description': 'ID of one matrix of 36 lens part of a zone'}),
        'row': ('row', np.arange(1, 7)),
        'col': ('col', np.arange(1, 7)),
    },
)
ds

So far this looks okay:

[Image: data representation in xarray]

I want to add a new variable: the mean value over all 36 points of an array.
I want it to be accessible with the same coordinates x and y.

This is what I tried:


def mean_of_array(values):
    return values.mean()

for x in ds['x']:
    for y in ds['y']:
        mean_val=mean_of_array(ds.sag0.sel(x=x, y=y))
        print(mean_val)
        ds = ds.assign(mean_val=mean_val)
ds

<xarray.DataArray 'sag0' ()>
array(49.51944444)
Coordinates:
    x        <U1 'A'
    y        <U2 'a1'

But I failed to get it stored in my dataset:
the new variable is there, but it is empty and has no coordinates.

[Image: xarray with empty new variable]

How do I do this? I’m not sure if this is the right approach at all.

Asked By: SamB034

Answers:

Rather than looping over x and y, you can take the mean over a subset of dimensions by passing a list-like argument as dim to any reduction operation, including mean:

ds["mean_val"] = ds.sag0.mean(dim=("row", "col"))
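
As a minimal, self-contained sketch of this one-liner: the dataset below uses random numbers as a stand-in for the real sag measurements (an assumption, since the Excel data is not available here), with the same dimension names and labels as in the question. The reduction removes row and col and keeps x and y, so the new variable can be selected with the same x and y labels as sag0.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the measured data (random values, not real sag data).
rng = np.random.default_rng(0)
x_labels = ["A", "B", "C"]
y_labels = [letter + number for letter in "abc" for number in "123"]
values = rng.normal(50.0, 1.0, size=(len(x_labels), len(y_labels), 6, 6))

ds = xr.Dataset(
    {"sag0": (("x", "y", "row", "col"), values)},
    coords={
        "x": x_labels,
        "y": y_labels,
        "row": np.arange(1, 7),
        "col": np.arange(1, 7),
    },
)

# Reduce over the two inner dimensions; x and y are kept.
ds["mean_val"] = ds.sag0.mean(dim=("row", "col"))

print(ds.mean_val.dims)                        # dimensions left after the reduction
print(ds.mean_val.sel(x="A", y="a1").item())   # mean of the 6x6 block at (A, a1)
```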

See the xarray docs on aggregation for more information.
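
If a loop is still preferred, for example for a custom per-block function, one possible approach is to collect the results in a 2-D NumPy array and assign it once with explicit dimension names, instead of assigning a scalar on every iteration (each scalar assignment replaces the previous one and carries no x or y dimension). The sketch below again uses random stand-in data, and block_range is a hypothetical helper introduced only for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in dataset with the same layout as in the question.
rng = np.random.default_rng(1)
ds = xr.Dataset(
    {"sag0": (("x", "y", "row", "col"), rng.normal(50.0, 1.0, size=(3, 9, 6, 6)))},
    coords={
        "x": ["A", "B", "C"],
        "y": [letter + number for letter in "abc" for number in "123"],
        "row": np.arange(1, 7),
        "col": np.arange(1, 7),
    },
)


def block_range(block):
    """Hypothetical per-block statistic: max minus min of one 6x6 block."""
    return float(block.max() - block.min())


# Fill one value per (x, y) pair, then assign the whole array at once.
result = np.empty((ds.sizes["x"], ds.sizes["y"]))
for i, x in enumerate(ds["x"].values):
    for j, y in enumerate(ds["y"].values):
        result[i, j] = block_range(ds.sag0.sel(x=x, y=y))

# Naming the dimensions ties the new variable to the existing x and y coordinates.
ds["sag_range"] = (("x", "y"), result)
```

The vectorized form remains shorter and faster when a built-in reduction exists; the loop is mainly useful for functions that cannot easily be expressed that way.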

Answered By: Michael Delgado