Comparing the value of a column with the previous value of a new column using Apply in Python (Pandas)

Question:

I have a dataframe with these values in column A:

df = pd.DataFrame(A,columns =['A'])

    A
0   0
1   5
2   1
3   7
4   0
5   2
6   1
7   3
8   0

I need to create a new column (called B) and populate it using the following conditions:

Condition 1: If the value of A is equal to 0 then, the value of B must be 0.

Condition 2: If the value of A is not 0, I compare it to the previous value of B. If A is higher than the previous value of B, I take A; otherwise I take the previous value of B.

The result should be this:

    A   B
0   0   0
1   5   5
2   1   5
3   7   7
4   0   0
5   2   2
6   1   2
7   3   3
8   0   0

The dataset is huge, so using loops would be too slow. I need to solve this without loops and without the pandas .loc indexer. Could anyone help me solve this using the apply function? I have tried different things without success.

Thanks a lot.

Asked By: Martingale

Answers:

Use .shift() to shift the column down by one row, then check whether the previous value is larger than the current one and the current one is not 0. Then use .mask() to replace the current value with the previous one wherever that condition holds.
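As a quick illustration of what these two methods do on their own, here is a minimal sketch on a made-up four-element Series (the names s, prev, cond and out are only for this example):

```python
import pandas as pd

s = pd.Series([0, 2, 3, 1])

# shift(1) moves every value down one row; the first row becomes NaN
prev = s.shift(1)

# True only where the previous value is larger and the current value is not 0
cond = (prev > s) & (s != 0)

# mask() replaces the values where cond is True with the matching value from prev
out = s.mask(cond, prev)

print(out.tolist())  # values 0, 2, 3, 3 (possibly as floats, because of the NaN from shift)
```

A single pass like this only looks one row back, which is presumably why the function below repeats the step until nothing changes.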

from io import StringIO
import pandas as pd
wt = StringIO("""A
0   0
1   2
2   3
3   1
4   2
5   7
6   0
""")

df = pd.read_csv(wt, sep=r'\s\s+', engine='python')
df
   A
0  0
1  2
2  3
3  1
4  2
5  7
6  0

def func(df, col):
    df['B'] = df[col].mask(cond=((df[col].shift(1) > df[col]) & (df[col] != 0)), other=df[col].shift(1))
    if col == 'B':
        while ((df[col].shift(1) > df[col]) & (df[col] != 0)).any():
            df['B'] = df[col].mask(cond=((df[col].shift(1) > df[col]) & (df[col] != 0)), other=df[col].shift(1))
    return df

(df.pipe(func, 'A').pipe(func, 'B'))

Output:

   A  B
0  0  0
1  2  2
2  3  3
3  1  3
4  2  3
5  7  7
6  0  0
Answered By: ali bakhtiari

Try this:

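# B temporarily holds the previous row's value of A
df['B'] = df['A'].shift()
# 0 when A is 0, otherwise the larger of A and that shifted value
df['B'] = df.apply(lambda x: 0 if x.A == 0 else x.A if x.A > x.B else x.B, axis=1)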
Answered By: Arya Sadeghi

One way to do this, I guess, could be the following:

def do_your_stuff(row):
    global value
    # fancy stuff here
    value = row["B"]
    [...]

value = df.iloc[0]['B']
df["C"] = df.apply(lambda row: do_your_stuff(row), axis=1)
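A possible variation on this idea, not part of the original answer, keeps the running state in a closure instead of a global. The sketch below assumes the rule from the question (reset to 0 when A is 0, otherwise keep the larger of A and the previous result); make_tracker and step are hypothetical names, and the sample data is taken from the first answer.

```python
import pandas as pd

def make_tracker(start=0):
    """Return a function that remembers the last value it returned."""
    state = {"prev": start}

    def step(a):
        # Reset on zero; otherwise keep the larger of a and the previous result
        state["prev"] = 0 if a == 0 else max(a, state["prev"])
        return state["prev"]

    return step

df = pd.DataFrame({"A": [0, 2, 3, 1, 2, 7, 0]})

# Series.apply visits the values in order, so the state carries from row to row
df["B"] = df["A"].apply(make_tracker())
print(df)
```

Creating a fresh tracker for each call means the state does not leak between runs, which is easy to get wrong with module-level globals.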
Answered By: Achille G

Using Achille's solution, I solved it this way:

import pandas as pd

A = [0, 2, 3, 0, 2, 7, 2, 3, 2, 20, 1, 0, 2, 5, 4, 3, 1]
df = pd.DataFrame(A, columns=['A'])

df['B'] = 0

def function(row):
    # value and prev carry the running result from one row to the next
    global value
    global prev

    if row['A'] == 0:
        value = 0
    elif row['A'] > value:
        value = row['A']
    else:
        value = prev

    prev = value
    return value

# Start from the first value of B (0)
value = df.iloc[0]['B']
prev = value

df["B"] = df.apply(function, axis=1)
df
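
Read another way, the rule looks like a running maximum that restarts at every zero in A. Under that reading, and assuming the values in A are non-negative as in the question's data, a loop-free sketch could use groupby with cummax (the name block is only for this example):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"A": [0, 5, 1, 7, 0, 2, 1, 3, 0]})

# Each zero in A starts a new block; the block id is a running count of zeros
block = df["A"].eq(0).cumsum()

# Within a block, B is the largest value of A seen so far
df["B"] = df["A"].groupby(block).cummax()
print(df)
```

This avoids both apply and global state, so it may scale better on a large dataset, though that has not been measured here.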
Answered By: Martingale