Replace value at a specific coordinate with the mean of surrounding cells in xarray
Question:
I’ve been trying to replace the value of a cell (that is, a point with a specific latitude and longitude) in an xarray Dataset. The coordinates of my Dataset are lon, lat and time. I want to replace the value at a certain lat and lon with the mean of the adjacent cells, computed for each time step.
So far I’ve managed to replace it with a scalar (e.g. ds.loc[{'lon': lon, 'lat': lat}]['qtot'] = 1), but when I try to replace it with an array or DataArray, the assignment appears to succeed but does not update the value in the original Dataset. I’ve tried the following expressions:
ds.loc[{'lon': lon, 'lat': lat}]['qtot'] = ds.loc[{'lon': slice(lon-0.5, lon+0.5), 'lat': slice(lat+0.5, lat-0.5)}]['qtot'].mean(['lat', 'lon'])
ds.loc[{'lon': lon, 'lat': lat}]['qtot'].values = ds.loc[{'lon': slice(lon-0.5, lon+0.5), 'lat': slice(lat+0.5, lat-0.5)}]['qtot'].mean(['lat', 'lon']).values
Any ideas would be much appreciated.
Answers:
In xarray, you can use .loc[] to assign values to DataArrays or Datasets:
# all of these work
ds.loc[{'lon': lon, 'lat': lat}] = 1
ds.loc[{'lon': lon, 'lat': lat}] = ds.loc[{'lon': lon, 'lat': lat}].mean()
ds['qstat'].loc[{'lon': lon, 'lat': lat}] = 1
ds['qstat'].loc[{'lon': lon, 'lat': lat}] = ds['qstat'].loc[{'lon': lon, 'lat': lat}].mean()
However, using .loc[] to get a view into a subset of a Dataset, then referencing a variable from that view, then assigning to it does seem to break xarray’s assignment handling:
# both of these have no effect
ds.loc[{'lon': lon, 'lat': lat}]['qstat'] = 1
ds.loc[{'lon': lon, 'lat': lat}]['qstat'] = ds.loc[{'lon': lon, 'lat': lat}]['qstat'].mean()
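The difference between the two forms can be checked on a toy Dataset. This is a minimal sketch: the 2x2 grid, the `target` dict and the result names are made up for illustration, and the variable is called qtot as in the question. The comments describe what appears to happen: the chained form assigns to a temporary Dataset returned by ds.loc[...], so the original is never touched.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x2 grid with a single time step; names follow the question.
ds = xr.Dataset(
    {"qtot": (("time", "lat", "lon"), np.zeros((1, 2, 2)))},
    coords={"time": [0], "lat": [0.5, 0.0], "lon": [10.0, 10.5]},
)
target = {"lon": 10.0, "lat": 0.5}

# Chained form: ds.loc[...] builds a new Dataset, and ["qtot"] = 1 rebinds
# the variable on that temporary object only.
ds.loc[target]["qtot"] = 1
chained_result = ds["qtot"].loc[target].isel(time=0).item()

# Variable-first form: the assignment goes through the DataArray's own
# .loc indexer and writes into the data held by ds.
ds["qtot"].loc[target] = 1
direct_result = ds["qtot"].loc[target].isel(time=0).item()

print(chained_result, direct_result)
```

Only the second assignment should be visible when reading the value back from ds.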
This could possibly be a bug, but it’s not a pattern I’ve seen used often, and the more standard way of referencing an entire variable prior to slicing & assignment does work.
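Applied to the question, the variable-first pattern could look like the sketch below. The grid, the 0.5-degree spacing, the descending latitude order and the -999 placeholder are assumptions made for illustration. Masking out the centre cell is also an assumption about what "adjacent" means; the slices in the question include the centre cell, and dropping the `.where(...)` step reproduces that behaviour.

```python
import numpy as np
import xarray as xr

# Hypothetical 0.5-degree grid with descending latitude, as the slices
# in the question imply.
lats = np.array([1.0, 0.5, 0.0, -0.5, -1.0])
lons = np.array([10.0, 10.5, 11.0, 11.5, 12.0])
data = np.arange(3 * 5 * 5, dtype=float).reshape(3, 5, 5)
data[:, 2, 2] = -999.0  # made-up bad value at the target cell
ds = xr.Dataset(
    {"qtot": (("time", "lat", "lon"), data)},
    coords={"time": np.arange(3), "lat": lats, "lon": lons},
)

lat, lon = 0.0, 11.0

# Select the variable first, then take the 3x3 window around the target.
window = ds["qtot"].loc[
    {"lon": slice(lon - 0.5, lon + 0.5), "lat": slice(lat + 0.5, lat - 0.5)}
]

# Mask out the centre cell so only the eight neighbours enter the mean.
is_centre = (window["lat"] == lat) & (window["lon"] == lon)
neighbour_mean = window.where(~is_centre).mean(["lat", "lon"])  # dims: (time,)

# Assign through the variable's own .loc indexer so ds is updated in place.
ds["qtot"].loc[{"lon": lon, "lat": lat}] = neighbour_mean
```

Because neighbour_mean keeps the time dimension, each time step at the target cell receives its own mean rather than a single scalar.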
This seems like it could be analogous to the pandas SettingWithCopyWarning case, where a chained assignment along the lines of df.loc[slicer][colname] = 1 does not modify df.
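For comparison, the pandas case can be reproduced with a small frame. This is only a sketch: the column names and the boolean `mask` are hypothetical, and a boolean selector is used deliberately because it returns a copy, which makes the lost write easy to observe. Warnings are silenced here only to keep the output quiet.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"qtot": [0.0, 0.0, 0.0], "lat": [0.5, 0.0, -0.5]})
mask = df["lat"] >= 0.0  # hypothetical row selector

with warnings.catch_warnings():
    # pandas may warn about this pattern (SettingWithCopyWarning in
    # versions that predate copy-on-write).
    warnings.simplefilter("ignore")
    # Chained form: the boolean selection returns a copy, so the write is lost.
    df.loc[mask]["qtot"] = 1.0
chained_total = df["qtot"].sum()

# A single .loc call with both row and column selectors writes into df itself.
df.loc[mask, "qtot"] = 1.0
direct_total = df["qtot"].sum()

print(chained_total, direct_total)
```

The fix has the same shape in both libraries: perform the selection and the assignment in one indexing operation on the object that owns the data.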