Updating pandas column faster than for loop

Question:

I need to update a dataframe column based on an additional comparison. I currently do this with a for loop and a couple of conditions:

import pandas as pd

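df = pd.DataFrame({'Signal': [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
                   'F1': [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
                   'F2': [5, 5, 5, 5, 5, 6, 4, 4, 4, 4]})

for i in range(1, len(df)):
    # Row i becomes 1 if the previous row is 1 and F1 <= F2 on this row
    if df['Signal'].iloc[i-1] == 1 and df['F1'].iloc[i] <= df['F2'].iloc[i]:
        # Write through df.loc; chained df['Signal'].iloc[i] = 1 may not update df
        df.loc[i, 'Signal'] = 1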

The for loop checks the previous row's state, then checks my condition, and updates the "Signal" column. On a dataframe with a few thousand rows, this operation begins to take longer than I'd like. I'm looking to optimize the code, but I'm not sure how.

So far I have this list comprehension, which gives me the values for the update but not the positions where I should apply them. I'm also unsure whether it would be faster than my loop:

[1 for i in range(1,len(df)) if (df['Signal'].iloc[i-1] == 1) & (df['F1'].iloc[i]<=(df['F2']).iloc[i]) ]
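
A caveat about the comprehension: it tests the original `Signal` values, while the loop tests values it may already have changed, so the two can select different rows. The following is a minimal sketch over plain Python lists, not pandas code. The variable names are illustrative, and with a DataFrame the lists would come from `df['Signal'].tolist()` and so on. Collecting `i` instead of `1` recovers the positions:

```python
# Sample data from the question, as plain lists.
signal = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
f1 = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
f2 = [5, 5, 5, 5, 5, 6, 4, 4, 4, 4]

# Keeping i instead of 1 gives the positions the comprehension selects.
# It only ever reads the original signal.
comprehension_rows = [
    i for i in range(1, len(signal))
    if signal[i - 1] == 1 and f1[i] <= f2[i]
]

# The loop reads the updated copy, so a 1 written at row i
# can trigger an update at row i + 1.
updated = list(signal)
loop_rows = []
for i in range(1, len(updated)):
    if updated[i - 1] == 1 and f1[i] <= f2[i]:
        updated[i] = 1
        loop_rows.append(i)

print(comprehension_rows)  # [1, 2, 3]
print(loop_rows)           # [1, 2, 3, 4, 5]
print(updated)             # [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]
```

Rows 4 and 5 are reached only because earlier rows were updated first, which is why a comprehension over the original column cannot reproduce the loop on its own.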
Asked By: mableguy


Answers:

Code

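# True where the row satisfies the update condition
cond1 = df['F1'] <= df['F2']
# Label each run of consecutive equal cond1 values
grp = cond1.ne(cond1.shift()).cumsum()
s1 = df['Signal']
# True where this row or the previous row already has Signal == 1
cond2 = (s1.eq(1) | s1.shift().eq(1))
# Within each run, carry a True forward; anything still missing becomes 0
s2 = cond2.where(cond2).groupby(grp).ffill().fillna(0).astype('int')
# Use s2 where cond1 holds, keep the original Signal elsewhere (returns a new DataFrame)
df.assign(Signal=s1.mask(cond1, s2))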

Output:

   Signal  F1  F2
0       1   5   5
1       1   5   5
2       1   5   5
3       1   5   5
4       1   5   5
5       1   5   6
6       0   5   4
7       0   5   4
8       0   5   4
9       1   5   4
Answered By: Panda Kim
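
One way to gain confidence in a vectorized rewrite like this is to compare it against the row-by-row rule on many small inputs. The sketch below is not the code from the answer: it swaps the `where`/`ffill`/`fillna` chain for a per-group `cummax`, which relies on the assumption that `Signal` only ever holds 0 and 1. The function names `update_with_loop` and `update_vectorized` are made up for illustration.

```python
import random

import pandas as pd


def update_with_loop(signal, f1, f2):
    """Reference: the question's row-by-row rule over plain lists."""
    out = list(signal)
    for i in range(1, len(out)):
        if out[i - 1] == 1 and f1[i] <= f2[i]:
            out[i] = 1
    return out


def update_vectorized(df):
    """Group-wise variant; assumes Signal only contains 0 and 1."""
    cond1 = df['F1'] <= df['F2']
    # Label runs of consecutive equal cond1 values.
    grp = cond1.ne(cond1.shift()).cumsum()
    s1 = df['Signal']
    # 1 where this row or the previous row already carries a signal.
    seed = (s1.eq(1) | s1.shift().eq(1)).astype(int)
    # A running maximum inside each run spreads the first 1 forward.
    spread = seed.groupby(grp).cummax()
    # Use the spread value where cond1 holds, the original value elsewhere.
    return df.assign(Signal=s1.mask(cond1, spread))


# Sample data from the question.
df = pd.DataFrame({'Signal': [1, 1, 1, 0, 0, 0, 0, 0, 0, 1],
                   'F1': [5, 5, 5, 5, 5, 5, 5, 5, 5, 5],
                   'F2': [5, 5, 5, 5, 5, 6, 4, 4, 4, 4]})
result = update_vectorized(df)
expected = update_with_loop(df['Signal'].tolist(),
                            df['F1'].tolist(),
                            df['F2'].tolist())

# Compare both versions on small random frames with 0/1 signals.
rng = random.Random(0)
mismatches = 0
for trial in range(200):
    n = rng.randint(1, 30)
    rand_df = pd.DataFrame({
        'Signal': [rng.randint(0, 1) for _ in range(n)],
        'F1': [rng.randint(0, 3) for _ in range(n)],
        'F2': [rng.randint(0, 3) for _ in range(n)],
    })
    want = update_with_loop(rand_df['Signal'].tolist(),
                            rand_df['F1'].tolist(),
                            rand_df['F2'].tolist())
    got = update_vectorized(rand_df)['Signal'].tolist()
    if got != want:
        mismatches += 1

print(result['Signal'].tolist() == expected, mismatches)
```

If `Signal` could hold values other than 0 and 1, the two versions may disagree, so a check like this is worth rerunning on data that resembles the real column.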