Getting the first row after one mask that meets a second condition, and creating a new column

Question:

This is my dataframe:

import pandas as pd
df = pd.DataFrame({'a': [20, 21, 333, 444, 1, 666], 'b': [20, 20, 20, 20, 20, 20], 'c': [222, 211, 2, 1, 100, 200]})

I want to use two masks. The first one finds the second row where a is greater than or equal to b and marks it in a new column d. This mask is:

mask = (df.a >= df.b)
df.loc[mask.cumsum().eq(2) & mask, 'd'] = 'x'
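
To make it clearer why this picks exactly one row, here is an illustrative breakdown of the two lines above on the same df. The names running and second_hit are only for illustration and are not used elsewhere:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [20, 21, 333, 444, 1, 666],
    'b': [20, 20, 20, 20, 20, 20],
    'c': [222, 211, 2, 1, 100, 200],
})

mask = df.a >= df.b                 # True, True, True, True, False, True
running = mask.cumsum()             # running count of True values: 1, 2, 3, 4, 4, 5
second_hit = running.eq(2) & mask   # True only where the count reaches 2 on a True row

print(second_hit.tolist())  # [False, True, False, False, False, False]
```

So the first mask selects the row with index 1.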

Now I want to add another mask. Basically, what I want is to find the first row that satisfies two conditions:

a) It comes after the row selected by the first mask (that is, after the second row where a >= b)

b) Column c is greater than column b

My desired output is as follows:

     a   b    c    d
0   20  20  222  NaN
1   21  20  211  NaN
2  333  20    2  NaN
3  444  20    1  NaN
4    1  20  100    x
5  666  20  200  NaN

I tried a couple of ways but the fact that it has to be after the first mask made it difficult for me.

Asked By: Amir

||

Answers:

You can try the following monstrosity:

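mask2 = mask.cumsum().eq(2) & mask  # & mask matters in general: the count stays at 2 on later rows where a < b
cand = mask2.cumsum().ge(1) & ~mask2 & (df.c >= df.b)  # rows after the first mask's row that pass the c condition
df.loc[cand.cumsum().eq(1) & cand, 'd'] = 'x'  # & cand keeps only the first such row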

Though probably someone smart will have a better way =)
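
The same idea can be spelled out step by step, which may be easier to follow. This is only a sketch on the question's df; the intermediate names (after_first, candidates, first_candidate) are illustrative. The final `& candidates` guards against marking later rows that fail the c condition while the running count is still 1:

```python
import pandas as pd

df = pd.DataFrame({
    'a': [20, 21, 333, 444, 1, 666],
    'b': [20, 20, 20, 20, 20, 20],
    'c': [222, 211, 2, 1, 100, 200],
})

mask = df.a >= df.b
mask2 = mask.cumsum().eq(2) & mask             # the row picked by the first mask (index 1)
after_first = mask2.cumsum().ge(1) & ~mask2    # rows strictly after that row
candidates = after_first & (df.c >= df.b)      # ...that also pass the c condition
first_candidate = candidates & candidates.cumsum().eq(1)  # only the first candidate

df.loc[first_candidate, 'd'] = 'x'
print(first_candidate.tolist())  # [False, False, False, False, True, False]
```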

Answered By: Guru Stron

With a single expression and pandas.Series.argmax:

df.loc[(mask.cumsum().gt(2) & (df['c'] > df['b'])).argmax(), 'd'] = 'x'

     a   b    c    d
0   20  20  222  NaN
1   21  20  211  NaN
2  333  20    2  NaN
3  444  20    1  NaN
4    1  20  100    x
5  666  20  200  NaN
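
A few caveats may be worth keeping in mind for other data. argmax returns a position rather than an index label, and it returns 0 when no row matches, so row 0 would be marked by mistake. Also, cumsum().gt(2) only becomes True at the third row where a >= b, so a row directly after the second hit with a < b is skipped. Below is a more defensive sketch under those assumptions. The helper name first_row_after_second_hit is hypothetical and not part of pandas:

```python
import pandas as pd

def first_row_after_second_hit(df):
    """Hypothetical helper: index label of the first row that comes after the
    second a >= b row and has c > b, or None if no such row exists."""
    mask = df['a'] >= df['b']
    second_hit = mask.cumsum().eq(2) & mask
    # Shift by one so only rows strictly after the second hit are True.
    after = second_hit.cumsum().shift(fill_value=0).ge(1)
    cond = after & (df['c'] > df['b'])
    # idxmax returns the label of the first True; guard the all-False case.
    return cond.idxmax() if cond.any() else None

df = pd.DataFrame({
    'a': [20, 21, 333, 444, 1, 666],
    'b': [20, 20, 20, 20, 20, 20],
    'c': [222, 211, 2, 1, 100, 200],
})

label = first_row_after_second_hit(df)
if label is not None:
    df.loc[label, 'd'] = 'x'
print(label)  # 4
```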
Answered By: RomanPerekhrest