python pandas dataframe : fill nans with a conditional mean of previous and next value

Question

I have the following dataframe:

| col1 | col2 | col3 |
| 5    | 3    |  9   | 
| NaN  | 6    |  NaN |
| NaN  | 3    |  7   |
| 7    | 8    |  5   |
| NaN  | 3    |  NaN |
| 2    | 2    |  4   |

And I want value NaN to be filled with the conditional mean of previous and next value based on the same column.

| col1 | col2 | col3 |
| 5    | 3    |  9   | 
| 6    | 6    |  8   |
| 6    | 3    |  7   |
| 7    | 8    |  5   |
| 4.5  | 3    |  4.5 |
| 2    | 2    |  4   |

Just like this, value 6 is the mean with 5 and 7. And this is a little part of my dataframe, so I need to replace all the NaN.

Asked By: heng xia

||

Source

Answer 1

EDIT:

For replace missing values in all columns use:

df = df.bfill().add(df.ffill()).div(2)

If need repalce only some columns, e.g. numeric:

cols = df.select_dtypes(np.number).columns

df[cols] = df[cols].bfill().add(df[cols].ffill()).div(2)

Use:

df = pd.DataFrame({'col':[1,15.6,np.nan, np.nan, 15.8,5, 
                          np.nan, 4,10, np.nan, np.nan,np.nan, 7]})

#filter non missing values
m = df['col'].notna()

#count 2 consecutive NaNs
m = df.groupby(m.cumsum()[~m])['col'].transform('size').eq(2)

#expand mask to previous and next values for consecutive 2 NaNs
mask = m.shift(fill_value=False) | m.shift(-1, fill_value=False)
print (mask)
0     False
1      True
2      True
3      True
4      True
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
Name: col, dtype: bool

#for filtered rows create means
df.loc[mask, 'col'] = df.loc[mask, 'col'].bfill().add(df.loc[mask, 'col'].ffill()).div(2)
print (df)
     col
0    1.0
1   15.6
2   15.7
3   15.7
4   15.8
5    5.0
6    NaN
7    4.0
8   10.0
9    NaN
10   NaN
11   NaN
12   7.0

If need means for all missing values remove mask:

df['col'] = df['col'].bfill().add(df['col'].ffill()).div(2)
print (df)
     col
0    1.0
1   15.6
2   15.7
3   15.7
4   15.8
5    5.0
6    4.5
7    4.0
8   10.0
9    8.5
10   8.5
11   8.5
12   7.0

Answered By: jezrael

python pandas dataframe : fill nans with a conditional mean of previous and next value

Question:

Answers: