Finding first occurrence of even numbers

Question:

This is my dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        'a': [20, 21, 333, 55, 444, 1000, 900, 44, 100, 200, 100],
        'b': [2, 2, 2, 4, 4, 4, 4, 3, 2, 2, 6]
    }
)

And this is the output that I want:

       a  b    c
0     20  2    x
1     21  2  NaN
2    333  2  NaN
3     55  4    x
4    444  4  NaN
5   1000  4  NaN
6    900  4  NaN
7     44  3  NaN
8    100  2    x
9    200  2  NaN
10   100  6    x

I want to create column c, which marks the first row of each streak of an even number in column b. A streak can be a single row or several consecutive rows with the same value; only its first row should be marked.

For example, the first row is marked because it starts the first streak of 2 in column b. When that streak ends, the first 4 is marked for the same reason. Row 8 is marked as well, because a new streak of 2 starts there.

I tried this code:

def finding_first_even_number(df):
    mask = (df.b % 2 == 0)
    df.loc[mask.cumsum().eq(1) & mask, 'c'] = 'x'
    return df

df = df.groupby('b').apply(finding_first_even_number)

But it does not give me the output that I want.

Asked By: Amir


Answers:

Solution

# counter that identifies blocks of
# consecutive rows with the same value in b
b = df['b'].diff().ne(0).cumsum()

# boolean mask: the value is even
# and it is the first occurrence in its block
mask = (df['b'] % 2 == 0) & ~b.duplicated()

# boolean indexing to set the flagged rows to `x`
df.loc[mask, 'c'] = 'x'

Result

       a  b    c
0     20  2    x
1     21  2  NaN
2    333  2  NaN
3     55  4    x
4    444  4  NaN
5   1000  4  NaN
6    900  4  NaN
7     44  3  NaN
8    100  2    x
9    200  2  NaN
10   100  6    x
Answered By: Shubham Sharma
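
To see why marking the first row of each block works, it can help to print the intermediate values. The following is only a sketch and not part of the answer above: it finds block starts by comparing each row with the previous one via `shift()` instead of `diff()`, and the names `starts_block`, `block_id`, `is_even` and `steps` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [20, 21, 333, 55, 444, 1000, 900, 44, 100, 200, 100],
        'b': [2, 2, 2, 4, 4, 4, 4, 3, 2, 2, 6],
    }
)

# A row starts a new block when its value differs from the previous row's.
# The first row always starts a block because shift() yields NaN there.
starts_block = df['b'].ne(df['b'].shift())

# Block id (illustrative): increases by one at every block start.
block_id = starts_block.cumsum()

# True where the value in b is even.
is_even = df['b'] % 2 == 0

# Side-by-side view of the intermediate steps, for inspection only.
steps = pd.DataFrame(
    {
        'b': df['b'],
        'starts_block': starts_block,
        'block_id': block_id,
        'is_even': is_even,
    }
)
print(steps)

# Mark rows that both start a block and hold an even value.
df.loc[starts_block & is_even, 'c'] = 'x'
print(df)
```

Comparing with `shift()` also works when column b holds non-numeric values, where `diff()` would not apply. Grouping with `groupby('b')`, as in the question, puts all rows with the same value into one group regardless of position, so the second streak of 2 (row 8) cannot be told apart from the first.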