How to proceed with `None` value in pandas fillna

Question:

I have the following dictionary:

fillna(value={'first_name':'Andrii', 'last_name':'Furmanets', 'created_at':None})

When I pass that dictionary to fillna I see:

raise ValueError('must specify a fill method or value')
ValueError: must specify a fill method or value

It seems to me that it fails on the None value.

I use pandas version 0.20.3.

Asked By: Andrii Furmanets


Answers:

What type of data structure are you using? This works for a pandas Series:

import pandas as pd

d = pd.Series({'first_name': 'Andrii', 'last_name':'Furmanets', 'created_at':None})
d = d.fillna('DATE')
Answered By: atwalsh

Setup
Consider the sample dataframe df

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

I can confirm the error

df.fillna(dict(A=1, B=None, C=4))
ValueError: must specify a fill method or value

This happens because pandas is cycling through keys in the dictionary and executing a fillna for each relevant column. If you look at the signature of the pd.Series.fillna method

Series.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)

You’ll see that the default for value is None, so passing None explicitly looks the same to pandas as passing no fill value at all. We can replicate this error with

df.A.fillna(None)

Or equivalently

df.A.fillna()

I’ll add that I’m not terribly surprised considering that you are attempting to fill a null value with a null value.
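The per-column behaviour described above can be sketched as a plain loop. This is only an illustration of the mechanism, not the actual pandas source: the outcomes dictionary and the loop are assumptions made for the example, and what happens for the None entry may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
fill_values = dict(A=1, B=None, C=4)

# Hypothetical stand-in for what DataFrame.fillna does with a dict:
# each key is handed to that column's own fillna call.
outcomes = {}
for column, value in fill_values.items():
    try:
        df[column].fillna(value)
        outcomes[column] = 'ok'
    except (ValueError, TypeError) as exc:
        # On the pandas versions discussed in this thread, the None entry
        # for column B is the one expected to end up here.
        outcomes[column] = type(exc).__name__

print(outcomes)
```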


What you need is a workaround.

Solution
Use pd.DataFrame.fillna over columns that you want to fill with non-null values. Then follow that up with a pd.DataFrame.replace on the specific columns you want to swap one null value with another.

import numpy as np

df.fillna(dict(A=1, C=2)).replace(dict(B={np.nan: None}))

     A     B  C
0  1.0  None  2
1  1.0     2  D
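
As a rough sketch of the same two-step idea, the dictionary can be split automatically. The helper name fillna_allow_none is hypothetical, the fill value 'missing' is an arbitrary placeholder, and the second step uses astype(object) plus where rather than replace, so treat it as one possible variant under those assumptions.

```python
import pandas as pd


def fillna_allow_none(frame, fill_values):
    """Hypothetical helper: like DataFrame.fillna, but tolerates None values."""
    # Step 1: only forward the entries that carry a real (non-None) fill value.
    real_fills = {col: val for col, val in fill_values.items() if val is not None}
    result = frame.fillna(real_fills)

    # Step 2: for columns mapped to None, store Python None in place of NaN.
    # Casting to object first keeps pandas from coercing None back to NaN.
    for col, val in fill_values.items():
        if val is None and col in result.columns:
            as_object = result[col].astype(object)
            result[col] = as_object.where(result[col].notna(), None)
    return result


df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
filled = fillna_allow_none(df, dict(A=1, B=None, C='missing'))
print(filled)
```

Columns that receive None end up with object dtype, which is the price of holding Python None next to numbers.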
Answered By: piRSquared

An alternative way to fill with None. I am on pandas 0.24.0, and I do this to insert NULL values into a PostgreSQL database.

import numpy as np
import pandas as pd

# Borrowing @piRSquared's dataframe
df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))

df

     A    B     C
0  1.0  NaN  None
1  NaN  2.0     D

# Fill NaN with None: wherever a value is null, use None; otherwise keep the original value.
df['A'] = np.where(df['A'].isnull(), None, df['A'])
df['B'] = np.where(df['B'].isnull(), None, df['B'])

# Result
df

      A     B     C
0   1.0  None  None
1  None   2.0     D
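
If the goal is to hand rows to a database driver, another option is to convert missing values while building plain Python tuples, which sidesteps column dtypes entirely. This is a minimal sketch: the helper name rows_with_none is made up for illustration, it assumes every cell holds a scalar, and no particular driver or table is assumed.

```python
import pandas as pd


def rows_with_none(frame):
    """Hypothetical helper: return rows as tuples, with nulls turned into None."""
    return [
        # pd.isna is True for None, NaN and NaT alike (scalar cells assumed).
        tuple(None if pd.isna(value) else value for value in row)
        for row in frame.itertuples(index=False, name=None)
    ]


df = pd.DataFrame(dict(A=[1, None], B=[None, 2], C=[None, 'D']))
rows = rows_with_none(df)
print(rows)
```

Each tuple can then serve as the parameters of a parameterized INSERT; DB-API drivers typically map None to SQL NULL.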

Answered By: addicted

It’s a bad idea to try to fill a datetime with None; this is exactly what pandas NaT (Not a Time) is for: representing missing datetimes.
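
A short illustration of that point, assuming a created_at column like the one in the question. The sample rows and the date are invented for the example.

```python
import pandas as pd

# Hypothetical data: one known timestamp and one missing one.
df = pd.DataFrame({
    'first_name': ['Andrii', 'Andrii'],
    'created_at': ['2017-08-01', None],
})

# to_datetime parses the strings and represents the missing entry as NaT,
# so the column keeps a datetime64 dtype instead of falling back to object.
df['created_at'] = pd.to_datetime(df['created_at'])

print(df['created_at'].dtype)
print(df['created_at'].isna().tolist())
```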

Answered By: smci

In case you want to normalize all of the nulls to Python’s None:

df.fillna(np.nan).replace([np.nan], [None])

The fillna call first replaces every null (None, NaT, np.nan, etc.) with NumPy’s NaN; replace then swaps NumPy’s NaN for Python’s None.

Answered By: AsaridBeck91