xarray mask outside list of coordinates

Question:

I have an xarray DataArray with values over a rectangular 2D grid, and a list of points (pairs of coordinate values) from an arbitrary subset of that grid, contained in a pandas DataFrame.

How do I mask out values (i.e. set them to NaN) in the DataArray whose grid coordinates do not appear in the list of points?

e.g. consider the DataArray

In [34]: import numpy as np, pandas as pd, xarray as xr

In [35]: da = xr.DataArray(data=np.random.randint(10, size=(5, 6)), coords={"x": np.linspace(0, 10, 5), "y": np.linspace(0, 12, 6)})

In [36]: da
Out[36]: 
<xarray.DataArray (x: 5, y: 6)>
array([[6, 0, 2, 3, 9, 8],
       [7, 6, 4, 8, 5, 8],
       [7, 4, 4, 5, 4, 7],
       [9, 8, 8, 1, 8, 0],
       [8, 9, 4, 3, 3, 6]])
Coordinates:
  * x        (x) float64 0.0 2.5 5.0 7.5 10.0
  * y        (y) float64 0.0 2.4 4.8 7.2 9.6 12.0

and dataframe

In [44]: coords = pd.DataFrame([[2.5, 4.8], [2.5, 7.2], [5.0, 12.0], [7.5, 7.2], [10.0, 2.4]], columns=["x_coord", "y_coord"])

In [45]: coords
Out[45]: 
   x_coord   y_coord
0      2.5       4.8
1      2.5       7.2
2      5.0      12.0
3      7.5       7.2
4     10.0       2.4

then I expect the output to be:

Out[84]: 
<xarray.DataArray (x: 5, y: 6)>
array([[nan, nan, nan, nan, nan, nan],
       [nan, nan,  4.,  8., nan, nan],
       [nan, nan, nan, nan, nan,  7.],
       [nan, nan, nan,  1., nan, nan],
       [nan,  9., nan, nan, nan, nan]])
Coordinates:
  * x        (x) float64 0.0 2.5 5.0 7.5 10.0
  * y        (y) float64 0.0 2.4 4.8 7.2 9.6 12.0
Asked By: ogb119


Answers:

You can convert the DataFrame to an xarray object by setting the x and y coordinates as the index and then calling to_xarray. Since no data columns are left once both columns become the index, I’ll just assign a "flag" variable:

In [20]: flag = (
    ...:     coords.assign(flag=1)
    ...:     .set_index(["x_coord", "y_coord"])
    ...:     .flag
    ...:     .to_xarray()
    ...:     .fillna(0)
    ...:     .rename({"x_coord": "x", "y_coord": "y"})
    ...: )

In [21]: flag
Out[21]:
<xarray.DataArray 'flag' (x: 4, y: 4)>
array([[0., 1., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 1., 0.],
       [1., 0., 0., 0.]])
Coordinates:
  * x        (x) float64 2.5 5.0 7.5 10.0
  * y        (y) float64 2.4 4.8 7.2 12.0

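To deal with floating-point issues, I’ll reindex the flag array onto your array’s coordinates with a small tolerance, so that its dims are consistent with those of da and grid points that are not in the DataFrame are filled with 0: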

In [22]: flag = flag.reindex(x=da.x, y=da.y, method="nearest", tolerance=1e-9, fill_value=0)

In [23]: flag
Out[23]:
<xarray.DataArray 'flag' (x: 5, y: 6)>
array([[0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 0., 0.],
       [0., 0., 0., 0., 0., 1.],
       [0., 0., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0., 0.]])
Coordinates:
  * x        (x) float64 0.0 2.5 5.0 7.5 10.0
  * y        (y) float64 0.0 2.4 4.8 7.2 9.6 12.0
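
The reason for method="nearest" with a tolerance, rather than an exact match, can be illustrated with a small sketch. It assumes the same y coordinate as in the question and standard double-precision floats; the variable names exact_match and close_match are only for illustration.

```python
import numpy as np

# Same y coordinate as in the question: six evenly spaced points on [0, 12].
y = np.linspace(0, 12, 6)

# linspace builds the interior points as k * step, so rounding can creep in.
# The fourth point prints as 7.2 at low precision, but it need not be
# bit-identical to the literal 7.2 stored in the DataFrame.
exact_match = bool(y[3] == 7.2)
close_match = bool(np.isclose(y[3], 7.2, rtol=0, atol=1e-9))

print(repr(float(y[3])), exact_match, close_match)
```

With an exact comparison, the two points at y=7.2 would most likely be dropped from the mask; the 1e-9 tolerance absorbs this kind of rounding while staying far below the grid spacing.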

This is now the same shape as your array and can be used as a mask:

In [24]: da.where(flag)
Out[24]:
<xarray.DataArray (x: 5, y: 6)>
array([[nan, nan, nan, nan, nan, nan],
       [nan, nan,  7.,  0., nan, nan],
       [nan, nan, nan, nan, nan,  5.],
       [nan, nan, nan,  8., nan, nan],
       [nan,  8., nan, nan, nan, nan]])
Coordinates:
  * x        (x) float64 0.0 2.5 5.0 7.5 10.0
  * y        (y) float64 0.0 2.4 4.8 7.2 9.6 12.0
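
For reference, the steps above can be collected into one script. This is a sketch, not a definitive implementation: it uses the fixed values printed in the question instead of fresh random data, passes dims explicitly, and adds an astype(bool) cast before where, which is an extra step the session above does not need.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Fixed copy of the question's example values, so the result is reproducible.
data = np.array([
    [6, 0, 2, 3, 9, 8],
    [7, 6, 4, 8, 5, 8],
    [7, 4, 4, 5, 4, 7],
    [9, 8, 8, 1, 8, 0],
    [8, 9, 4, 3, 3, 6],
])
da = xr.DataArray(
    data,
    dims=("x", "y"),
    coords={"x": np.linspace(0, 10, 5), "y": np.linspace(0, 12, 6)},
)
coords = pd.DataFrame(
    [[2.5, 4.8], [2.5, 7.2], [5.0, 12.0], [7.5, 7.2], [10.0, 2.4]],
    columns=["x_coord", "y_coord"],
)

# 1. Turn the point list into a small 2D array of 0/1 flags.
flag = (
    coords.assign(flag=1)
    .set_index(["x_coord", "y_coord"])["flag"]
    .to_xarray()
    .fillna(0)
    .rename({"x_coord": "x", "y_coord": "y"})
)

# 2. Expand it onto the full grid; grid points with no match become 0.
flag = flag.reindex(x=da.x, y=da.y, method="nearest", tolerance=1e-9, fill_value=0)

# 3. Cast to bool (optional extra step) and mask everything else to NaN.
mask = flag.astype(bool)
masked = da.where(mask)

print(masked)
```

Note that to_xarray needs a unique index, so duplicate rows in the DataFrame would have to be dropped first (e.g. with drop_duplicates).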

Just in case it’s useful: if you wanted to do the opposite, that is, extract the values from the DataArray at the points given in your DataFrame, you could use xarray’s advanced indexing rules to pull specific points out of the array using DataArray indexers:

In [28]: da.sel(
    ...:     x=coords.x_coord.to_xarray(),
    ...:     y=coords.y_coord.to_xarray(),
    ...:     method="nearest",
    ...:     tolerance=1e-9, # use a (low) tolerance to handle floating-point error
    ...: )

Out[28]:
<xarray.DataArray (index: 5)>
array([7, 0, 5, 8, 8])
Coordinates:
    x        (index) float64 2.5 2.5 5.0 7.5 10.0
    y        (index) float64 4.8 7.2 12.0 7.2 2.4
  * index    (index) int64 0 1 2 3 4
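
A self-contained variant of this pointwise selection, again using the fixed values from the question, might look like the sketch below. The dimension name "points" is an arbitrary choice made for this example; the key idea is that both indexers share one dimension, so xarray selects one value per (x, y) pair instead of a 5 x 5 block.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Fixed copy of the question's example values.
data = np.array([
    [6, 0, 2, 3, 9, 8],
    [7, 6, 4, 8, 5, 8],
    [7, 4, 4, 5, 4, 7],
    [9, 8, 8, 1, 8, 0],
    [8, 9, 4, 3, 3, 6],
])
da = xr.DataArray(
    data,
    dims=("x", "y"),
    coords={"x": np.linspace(0, 10, 5), "y": np.linspace(0, 12, 6)},
)
coords = pd.DataFrame(
    [[2.5, 4.8], [2.5, 7.2], [5.0, 12.0], [7.5, 7.2], [10.0, 2.4]],
    columns=["x_coord", "y_coord"],
)

# Indexers that share the (hypothetical) "points" dimension are paired up
# element by element rather than combined as an outer product.
points = da.sel(
    x=xr.DataArray(coords["x_coord"].to_numpy(), dims="points"),
    y=xr.DataArray(coords["y_coord"].to_numpy(), dims="points"),
    method="nearest",
    tolerance=1e-9,  # small tolerance to absorb floating-point error
)

print(points.values)
```

If a requested point lies farther than the tolerance from every grid coordinate, sel raises a KeyError rather than silently picking a neighbour.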
Answered By: Michael Delgado