Is it possible to select a dataset by time range when the range is different for every pixel in Python's xarray module?

Question:

I am trying to select only the part of the data that falls within a specific time range, where that range is different for every pixel.

For indexing, I have two xr.DataArrays of dtype np.datetime64[ns] with shape (lat: 152, lon: 131), named time_range_min and time_range_max.
One holds the start dates and the other holds the end dates.

I tried this to select the data:

dataset = data.sel(time=slice(time_range_min, time_range_max))

but it raises

cannot use non-scalar arrays in a slice for xarray indexing:
<xarray.DataArray 'NDVI' (lat: 152, lon: 131)>

Does the fact that I cannot use non-scalar arrays mean that this is not possible in general, or can I transform my arrays somehow?
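
For reference, the setup can be reproduced with a small synthetic example. The grid size, dates, and random values below are placeholders chosen for illustration, not the real data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in: 6 daily time steps on a 2 x 3 grid.
time = pd.date_range("2021-01-01", periods=6, freq="D")
data = xr.DataArray(
    np.random.rand(6, 2, 3),
    coords={"time": time, "lat": [10.0, 20.0], "lon": [1.0, 2.0, 3.0]},
    dims=("time", "lat", "lon"),
    name="NDVI",
)

# Per-pixel start and end dates with dims (lat, lon).
time_range_min = xr.DataArray(np.full((2, 3), time.values[1]), dims=("lat", "lon"))
time_range_max = xr.DataArray(np.full((2, 3), time.values[4]), dims=("lat", "lon"))

# Passing the 2-D arrays as slice bounds is what triggers the error.
try:
    data.sel(time=slice(time_range_min, time_range_max))
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```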

Asked By: till Kadabra

||

Answers:

If "time" is a list of date strings ordered from past to present (e.g. ["10-20-2021", "10-21-2021", …]):

import numpy as np
# Stack the per-pixel start and end dates into one array of shape (2, n_pixels).
listOfMinMaxTimeRanges = np.array([np.ravel(time_range_min), np.ravel(time_range_max)])
specifiedRangeOfTimeIndexedList = []
for pixel in range(listOfMinMaxTimeRanges.shape[1]):
    # Positions of this pixel's start and end dates in `time` (both must occur in `time`).
    start = time.index(listOfMinMaxTimeRanges[0][pixel])
    end = time.index(listOfMinMaxTimeRanges[1][pixel])
    # Collect every position from start to end, inclusive.
    specifiedRangeOfTimeIndexedList.extend(range(start, end + 1))

Depending on how your dataset is structured:

dataset = data.sel(time = specifiedRangeOfTimeIndexedList)

or

dataset = data.sel(time = time[specifiedRangeOfTimeIndexedList])

or

dataset = dataset[time[specifiedRangeOfTimeIndexedList]]

or

dataset = dataset[:, time[specifiedRangeOfTimeIndexedList]]

or

dataset = dataset[time[specifiedRangeOfTimeIndexedList], :, :]

or

dataset = dataset[specifiedRangeOfTimeIndexedList]
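
To see what the loop above produces, here is a small sketch with toy values. The names starts, ends, per_pixel_indices and flat_indices are made up for illustration. It suggests that one flat index list selects the union of all pixels' ranges along time, so on its own it may not give a different range per pixel; keeping one list per pixel preserves that information.

```python
# Toy stand-ins: an ordered list of date strings and the ranges of two pixels.
time = ["10-20-2021", "10-21-2021", "10-22-2021", "10-23-2021", "10-24-2021"]
starts = ["10-20-2021", "10-22-2021"]  # one start date per pixel
ends = ["10-21-2021", "10-23-2021"]    # one end date per pixel

# Keep one list of positions per pixel instead of a single flat list.
per_pixel_indices = [
    list(range(time.index(s), time.index(e) + 1)) for s, e in zip(starts, ends)
]

# Flattening gives the union over pixels, so the per-pixel windows are lost.
flat_indices = sorted({i for idx in per_pixel_indices for i in idx})

print(per_pixel_indices)  # [[0, 1], [2, 3]]
print(flat_indices)       # [0, 1, 2, 3]
```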

Answered By: Ori Yarden

I found a way to handle every grid cell separately by stacking in xarray.
Inside the loop, time_range_min and time_range_max each refer to a single date:

stack = dataset.value.stack(gridcell=['lat', 'lon'])
for unique_value, grouped_array in stack.groupby('gridcell'):
    # time_range_min and time_range_max are scalar dates for this grid cell here
    selected = grouped_array.sel(time=slice(time_range_min, time_range_max))
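
A loop-free alternative that may fit here is a boolean mask combined with where. This is only a sketch on synthetic data, not the method used above: the grid size, the dates, and the names start_offsets, mask, selected and trimmed are assumptions. It relies on xarray broadcasting the time coordinate against the (lat, lon) bounds.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the real data: 10 daily time steps on a 2 x 3 grid.
time = pd.date_range("2021-01-01", periods=10, freq="D")
lat, lon = [10.0, 20.0], [1.0, 2.0, 3.0]
data = xr.DataArray(
    np.arange(10 * 2 * 3, dtype=float).reshape(10, 2, 3),
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="NDVI",
)

# Per-pixel start and end dates with dims (lat, lon); each window spans 3 days.
start_offsets = np.array([[0, 1, 2], [3, 4, 5]])
time_range_min = xr.DataArray(
    time.values[start_offsets], coords={"lat": lat, "lon": lon}, dims=("lat", "lon")
)
time_range_max = xr.DataArray(
    time.values[start_offsets + 2], coords={"lat": lat, "lon": lon}, dims=("lat", "lon")
)

# data.time has dims (time,) and the bounds have dims (lat, lon), so the
# comparison broadcasts to a boolean mask over time, lat and lon.
mask = (data.time >= time_range_min) & (data.time <= time_range_max)

# Keep values inside each pixel's window and set everything else to NaN.
selected = data.where(mask)

# Optionally drop time steps that fall outside every pixel's window.
trimmed = data.where(mask, drop=True)
```

The result keeps a regular (time, lat, lon) shape and marks out-of-range values as NaN, since a single array cannot hold a different number of time steps per pixel.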
    
Answered By: till Kadabra