How to import netCDF4 file with xarray when index names have multiple dimensions?

Question:

When I try to import netCDF4 files using xarray I get the following error:

MissingDimensionsError: 'name' has more than 1-dimension and the same name as one of its dimensions ('time', 'name'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

However, I can successfully import these data using the netCDF4 Python library and get the data I need from it. The problem is that this method is very slow, so I was looking for something faster and wanted to try xarray. Here is an example file, and the code that gives me the error in question.

from netCDF4 import Dataset
#import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np         
#import seaborn as sns
from tkinter import Tk

from tkinter.filedialog import askdirectory
import os
import xarray as xr

#use this function to get a directory name where the files are
def get_dat():
    root = Tk()
    root.withdraw()
    root.focus_force()
    root.attributes("-topmost", True)      #makes the dialog appear on top
    filename = askdirectory()      # ask the user to pick a directory
    root.destroy()
    root.quit()
    return filename

directory=get_dat()

#loop through files in directory and read the netCDF4 files
for filename in os.listdir(directory):     #loop through files in user's dir
    if filename.endswith(".nc"):     #all my files are .nc not .nc4
        runstart=pd.Timestamp.now()     #pd.datetime was removed in pandas 2.0
        #I get the error right here
        rootgrp3 = xr.open_dataset(os.path.join(directory, filename))
        #more stuff happens here with the data, but this stuff works
Asked By: bart cubrich


Answers:

This issue is still valid. The problem arises when a variable has more than one dimension and shares its name with one of those dimensions.

As an example, the result.nc output files produced by the GOTM model have this problem for the coordinates z and zi:

dimensions:
    time = UNLIMITED ; // (4018 currently)
    lon = 1 ;
    lat = 1 ;
    z = 218 ;
    zi = 219 ;
variables:
    ... 
    float z(time, z, lat, lon) ;
    float zi(time, zi, lat, lon) ;
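To make the rule concrete, here is a minimal pure-Python sketch of the check described in the error message: a variable is rejected when its name appears among its own dimensions and it has more than one dimension. The helper name find_conflicts and the dictionary layout are made up for illustration; this is not xarray's actual implementation.

```python
def find_conflicts(variables):
    """Return names of variables that would trip the rule in the error message.

    `variables` maps each variable name to its tuple of dimension names.
    """
    return [
        name
        for name, dims in variables.items()
        if name in dims and len(dims) > 1
    ]


# Dimension layout copied from the GOTM header above. The 1-D "time"
# variable is an assumption, added only as a non-conflicting example.
gotm_variables = {
    "time": ("time",),
    "z": ("time", "z", "lat", "lon"),
    "zi": ("time", "zi", "lat", "lon"),
}

conflicts = find_conflicts(gotm_variables)
print(conflicts)
```

Only z and zi are flagged; a 1-D variable named after its own dimension is an ordinary coordinate and is fine.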

It has been proposed to add a 'rename_var' kwarg to xr.open_dataset() as a workaround, but to my knowledge it has not been implemented yet.

The quick workaround I use is to call ncrename (from the NCO toolkit) from Python where needed.

In my case:

 os.system('ncrename -v z,z_coord -v zi,zi_coord result.nc resultxr.nc')

This allows

 r2 = xr.open_dataset(testdir+'resultxr.nc')

while

 r = xr.open_dataset(testdir+'result.nc')

was failing.
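If you prefer not to go through os.system, the same call could be sketched with subprocess, which raises an exception when ncrename fails instead of silently returning a non-zero status. The helper names build_ncrename_cmd and rename_variables are hypothetical; the sketch assumes NCO is installed and ncrename is on the PATH.

```python
import shutil
import subprocess


def build_ncrename_cmd(src, dst, renames):
    """Build the ncrename argument list; `renames` maps old names to new names."""
    cmd = ["ncrename"]
    for old, new in renames.items():
        cmd += ["-v", f"{old},{new}"]
    return cmd + [src, dst]


def rename_variables(src, dst, renames):
    """Run ncrename, raising if NCO is missing or the command fails."""
    if shutil.which("ncrename") is None:
        raise RuntimeError("ncrename not found on PATH; install NCO first")
    subprocess.run(build_ncrename_cmd(src, dst, renames), check=True)


# Only build and print the command here, so this runs without NCO installed.
cmd = build_ncrename_cmd(
    "result.nc", "resultxr.nc", {"z": "z_coord", "zi": "zi_coord"}
)
print(" ".join(cmd))
```

Calling rename_variables("result.nc", "resultxr.nc", {"z": "z_coord", "zi": "zi_coord"}) would then write resultxr.nc, which xr.open_dataset can read.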

Answered By: acapet

In IPython, this is a very easy workaround:

!ncrename -v name,name_matrix filename.nc  # rename the variable 'name' to avoid the dimension/variable name conflict in xarray; requires NCO (on Linux)
Answered By: Tom F