Pandas fillna throws ValueError: fill value must be in categories

Question

Description: both features have a categorical dtype. I used this same code in a different kernel on the same
dataset and it worked fine; the only difference is that the features there were float64. I later converted these features to Categorical
because all the features in the dataset represent categories.

Below is the code:

AM_train['product_category_2'].fillna('Unknown', inplace =True)
AM_train['city_development_index'].fillna('Missing', inplace =True)

Asked By: Ravi Varma

Answer 1

Use Series.cat.add_categories to add the new category first, then fill:

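# Register the fill value as a category, then assign the filled column back
# (assigning back avoids relying on inplace=True on a column selection).
AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')

AM_train['city_development_index'] = AM_train['city_development_index'].cat.add_categories('Missing')
AM_train['city_development_index'] = AM_train['city_development_index'].fillna('Missing')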

Sample:

import numpy as np
import pandas as pd

AM_train = pd.DataFrame({'product_category_2': pd.Categorical(['a','b',np.nan])})
AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')

print(AM_train)
  product_category_2
0                  a
1                  b
2            Unknown
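
For context, here is a minimal sketch of why the error appears at all: a categorical column only accepts values from its fixed set of categories, so filling with a value outside that set is rejected. The series `s` and the variable names are made up for illustration, and the exception type is caught broadly because, as far as I know, it differs between pandas versions (ValueError in older releases, TypeError in newer ones).

```python
import numpy as np
import pandas as pd

# Hypothetical toy series; 'Unknown' is not one of its categories.
s = pd.Series(pd.Categorical(["a", "b", np.nan]))
print(s.cat.categories.tolist())

# Filling with a value outside the categories is rejected.
try:
    s.fillna("Unknown")
    raised = False
except (ValueError, TypeError) as exc:
    raised = True
    print(type(exc).__name__, exc)

# Registering the value as a category first makes the same fill valid.
filled = s.cat.add_categories("Unknown").fillna("Unknown")
print(filled.tolist())
```

This would also explain why the same code worked while the columns were float64 or object: those dtypes have no fixed category set to violate.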

Answered By: jezrael

Answer 2

I was getting the same error in a data frame while trying to get rid of all the NaNs.
I did not look into it too deeply, but replacing .fillna(value) with .replace(np.nan, value) did the trick.
Use with caution, since I am not sure np.nan catches all the values that are interpreted as NaN.

Answered By: lotrus28

Answer 3

In my case, I was using fillna on a dataframe with many features when I got that error.

I preferred converting the necessary features to string first, using fillna and finally converting them back to category if needed.

# Convert to string, fill the missing values, then convert back to category.
AM_train['product_category_2'] = AM_train['product_category_2'].astype('string')
AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')
AM_train['product_category_2'] = AM_train['product_category_2'].astype('category')

It could also be automated by finding all features with dtype 'category' and converting them using the logic above.
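
A minimal sketch of that automation, assuming a toy frame `df` with made-up columns and a single fill value 'Unknown' for every categorical column (all hypothetical, not taken from the original dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical example data: two categorical columns and one numeric column.
df = pd.DataFrame({
    "product_category_2": pd.Categorical(["a", "b", np.nan]),
    "city_development_index": pd.Categorical([np.nan, "low", "high"]),
    "clicks": [1, 2, 3],
})

# Select only the columns whose dtype is 'category'.
cat_cols = df.select_dtypes(include="category").columns

for col in cat_cols:
    # string -> fill -> back to category, one column at a time
    df[col] = df[col].astype("string").fillna("Unknown").astype("category")

print(df)
print(df.dtypes)
```

Note that the round trip through string rebuilds the categories from the values present, so any category ordering or unused categories on the original column are not preserved.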

Answered By: Yves

Answer 4

Always reload the original dataset before running fillna a second time, and avoid inplace=True.

This problem arises because the code has been run twice, so the second fillna cannot be performed.

Answered By: Arjun Goud
