How can I add a column to calculate whether there is at least one record?

Question:

I have a dataframe ‘df’ like this:

   user_id  record
0        a      No
1        a      No
2        a     Yes
3        b      No
4        b      No
5        c     Yes
6        c     Yes

Each row is a record of a user's operation. The column ‘record’ indicates whether the operation is illegal. Now I want to add a column that shows whether a user has any illegal operation. The result should be:

   user_id  record history
0        a      No       1
1        a      No       1
2        a     Yes       1
3        b      No       0
4        b      No       0
5        c     Yes       1
6        c     Yes       1

Once a user has at least one illegal operation, every ‘history’ value for that user should be 1. How can I get this?

Asked By: xiaoluohao

||

Answers:

I have tried one approach:

df['history'] = df.groupby('user_id')['record'].transform(lambda x: int('Yes' in x.values))

This solves the problem, but I don’t think it is a very elegant approach.
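
For anyone who wants to try this, here is a minimal, self-contained sketch of the lambda approach on the sample data from the question. It selects the ‘record’ column explicitly before calling transform, and the trailing astype(int) is an extra safeguard I added so the column is a plain integer dtype; neither is required by the idea itself.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b", "c", "c"],
    "record": ["No", "No", "Yes", "No", "No", "Yes", "Yes"],
})

# For each user, check whether 'Yes' occurs among that user's records.
# The lambda returns one scalar per group, which transform broadcasts
# back to every row of the group.
df["history"] = (
    df.groupby("user_id")["record"]
    .transform(lambda x: int("Yes" in x.values))
    .astype(int)  # assumption: force an integer dtype for the new column
)

print(df)
```

Because the lambda runs once per user in Python, this can get slow when there are many distinct users.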

Answered By: xiaoluohao

This might be a bit faster than your solution:

df['history'] = df['record'].eq('Yes').groupby(df['user_id']).transform('any').astype(int)
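
As a rough, runnable sketch of the same idea on the question's sample data: transform('any') yields True/False, so the result is cast with astype(int) to get the 1/0 values asked for. The second column, history_isin, is a hypothetical name for a different technique (no groupby, just isin) that I include only for comparison.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b", "c", "c"],
    "record": ["No", "No", "Yes", "No", "No", "Yes", "Yes"],
})

# Boolean mask: True where the operation is illegal.
is_illegal = df["record"].eq("Yes")

# Per user, 'any' is True if at least one row in the group is True;
# transform broadcasts that result back to every row of the group.
df["history"] = is_illegal.groupby(df["user_id"]).transform("any").astype(int)

# Alternative without groupby (not from the answer above): collect the
# users with at least one illegal operation, then test membership.
flagged_users = df.loc[is_illegal, "user_id"]
df["history_isin"] = df["user_id"].isin(flagged_users).astype(int)

print(df)
```

Both columns should agree; which one is faster probably depends on the data, so it is worth timing on your own dataframe.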
Answered By: Quang Hoang