Why does fillna have no effect after several chained operations on a DataFrame Series?

Question:

I have a DataFrame that looks like this:

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'], 
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group':[1,2,3,3,3,4,4,4,5,5]})

Everything works fine when I do:

df['prev'] = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1)
df['prev'].fillna(0, inplace=True)

But if I do it in one line, the fillna() function does not work:

df['prev'] = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1).fillna(0)

My question is: why is that? And is there a way to do it in one line?

Asked By: Alex DiS

||

Answers:

I’m not exactly sure why it’s not working, but I have a rough idea. In your first version, this is what is happening:

df['prev'] = df[...]...
df['prev'] = df['prev'].fillna(0)

Your second version:

df['prev'] = df[...]....fillna(0)

This probably happens because fillna(0) is applied to the filtered result rather than to the whole column. When that result is assigned to the new column prev, the rows that were filtered out have no value to take, so they come back as NaN.

Answered By: DialFrost

This happens because df[(df.Event == 'A') & (df.Direction == 'UP')] keeps only the rows where Event is A and Direction is UP. When you put fillna(0) at the end, you are only replacing NaN in that filtered subset of rows; the rest will be filled with NaN because the column prev didn’t exist previously.

Also, because the column prev didn’t exist previously, I think you cannot do this in a single line. What you are doing is creating a whole column while modifying only a subset of it, which you would have to break into 2 steps.
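As a minimal sketch of the alignment behaviour described above, here is a toy frame (the names toy and partial are made up for illustration and are not from the question). Assigning a Series that covers only some index labels leaves the other rows as NaN:

```python
import pandas as pd

# Hypothetical toy data, only meant to illustrate index alignment
toy = pd.DataFrame({'x': [10, 20, 30, 40]})

# A Series that only has values for labels 0 and 2
partial = pd.Series([1, 2], index=[0, 2])

# Assignment aligns on the index: labels 1 and 3 have no match and become NaN
toy['y'] = partial

missing_labels = toy.index[toy['y'].isna()].tolist()
print(toy)
print(missing_labels)
```

Because the NaN values only appear during the assignment, a fillna call made before the assignment has nothing to act on.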

Answered By: 99_m4n

Look at the output at this step:

print(df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1))

# Output:
0    1
2    1
3    2
dtype: int64

Do you see any NaN values to fill? Is adding .fillna(0) here going to do anything?
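One way to check this directly, sketched with the DataFrame from the question (the variable names mask, filtered, nan_count and unchanged are introduced here for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

mask = (df.Event == 'A') & (df.Direction == 'UP')
filtered = df[mask].groupby('group').cumcount().add(1)

# Count the NaN values in the intermediate result, before any assignment
nan_count = int(filtered.isna().sum())

# Check whether fillna(0) changes the intermediate result at all
unchanged = filtered.fillna(0).equals(filtered)

print(nan_count, unchanged)
```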


A one-liner that would work:

df['prev'] = df.assign(prev = df[(df.Event == 'A') & (df.Direction == 'UP')].groupby('group').cumcount().add(1))['prev'].fillna(0)
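Another possible one-liner, offered as a sketch rather than the definitive approach, is to reindex the filtered result to the full index with a fill value before assigning it. The mask variable is only pulled out here for readability:

```python
import pandas as pd

df = pd.DataFrame({'Event': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'B', 'A', 'C'],
                   'Direction': ['UP', 'DOWN', 'UP', 'UP', 'DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN', 'UP'],
                   'group': [1, 2, 3, 3, 3, 4, 4, 4, 5, 5]})

mask = (df.Event == 'A') & (df.Direction == 'UP')

# Expand the filtered counts to every row label of df, using 0 where a label is missing
df['prev'] = df[mask].groupby('group').cumcount().add(1).reindex(df.index, fill_value=0)

print(df)
```

Since the missing labels are filled during reindex, no NaN is ever introduced, so the column should not be upcast to float the way it is when fillna runs after the assignment.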
Answered By: BeRT2me