Change the value of a column to the maximum value above it in the same column

Question:

This is my dataframe:

df = pd.DataFrame({'a': [100, 103, 101, np.nan, 105, 107, 100]})

And this is the output that I want:

       a    b
0  100.0  100
1  103.0  103
2  101.0  103
3    NaN  103
4  105.0  105
5  107.0  107
6  100.0  107

I want to create column b, which takes each value of column a and replaces it with the maximum value found at or above that row.

For example, once 103 appears in a, b should stay at 103 until a greater number appears in a. That is why rows 2 and 3 become 103. Row 4 holds 105, which is greater than 103, so b takes that value until a greater number appears in a.

I have tried the approaches from a couple of posts on Stack Overflow, one of which was this answer, but I still couldn't figure out how to do it.

Asked By: Amir


Answers:

Use Series.ffill to replace missing values with the previous non-NaN value, then Series.cummax to take the running maximum:

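import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [100, 103, 101, np.nan, 105, 107, 100]})

# forward-fill NaN, take the running maximum, then cast to integer
df['b'] = df['a'].ffill().cummax().astype(int)
# alternative (the downcast parameter is deprecated in recent pandas releases)
# df['b'] = df['a'].ffill(downcast='int').cummax()
print(df)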
       a    b
0  100.0  100
1  103.0  103
2  101.0  103
3    NaN  103
4  105.0  105
5  107.0  107
6  100.0  107
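
To see what each step contributes, the chain can be split into intermediates. This is only a sketch of the same approach; the names `filled` and `running_max` are illustrative and do not come from the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [100, 103, 101, np.nan, 105, 107, 100]})

# Step 1: replace each NaN with the last non-NaN value above it
filled = df['a'].ffill()

# Step 2: running maximum from the top of the column down to each row
running_max = filled.cummax()

# Step 3: no NaN remains here, so a plain integer dtype is safe
df['b'] = running_max.astype(int)
print(df)
```

Row 3 is the only place where `ffill` matters: it turns the NaN into 101, and `cummax` then lifts it to 103, the largest value seen so far.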

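If the first value in the real data may be NaN, cast to the nullable Int64 dtype instead, because a leading NaN has no previous value to forward-fill from: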

df = pd.DataFrame({'a': [np.nan, 103, 101, np.nan, 105, 107, 100]})

df['b'] = df['a'].ffill().cummax().astype('Int64')
print(df)
       a     b
0    NaN  <NA>
1  103.0   103
2  101.0   103
3    NaN   103
4  105.0   105
5  107.0   107
6  100.0   107
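
The reason for switching dtypes can be checked directly. The sketch below assumes that casting a float Series containing NaN to plain `int` raises a `ValueError` (or a subclass of it), which is the behaviour I would expect from current pandas releases; the names `s`, `plain_int_ok` and `b` are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 103, 101, np.nan, 105, 107, 100])
running_max = s.ffill().cummax()  # first element is still NaN

# Assumption: plain int cannot hold NaN, so this cast is expected to fail
try:
    running_max.astype(int)
    plain_int_ok = True
except ValueError:
    plain_int_ok = False

# The nullable Int64 dtype stores the missing entry as <NA>
b = running_max.astype('Int64')
print(plain_int_ok)
print(b)
```

The capital "I" in 'Int64' matters: it selects the pandas nullable integer dtype rather than NumPy's int64.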
Answered By: jezrael