How can I add a column to calculate whether there is at least one record?
Question:
I have a dataframe ‘df’ like this:
  user_id record
0       a     No
1       a     No
2       a    Yes
3       b     No
4       b     No
5       c    Yes
6       c    Yes
Each row is a record of one user's operation. The column ‘record’ indicates whether that operation is illegal. Now I want to add a column that shows whether a user has any illegal operation. The result should be:
  user_id record  history
0       a     No        1
1       a     No        1
2       a    Yes        1
3       b     No        0
4       b     No        0
5       c    Yes        1
6       c    Yes        1
If a user has at least one illegal operation, ‘history’ should be 1 on every row for that user, and 0 otherwise. How can I get this?
Answers:
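I have tried one way, selecting the ‘record’ column explicitly so that only that column is passed to the lambda:
df['history'] = df.groupby('user_id')['record'].transform(lambda x: int('Yes' in x.values))
This solves the problem, but I doubt it is the smartest approach.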
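This might be a bit faster than your solution, because it avoids calling a Python lambda for every group. transform('any') returns booleans, so cast to int to get the 1/0 values you asked for:
df['history'] = df['record'].eq('Yes').groupby(df['user_id']).transform('any').astype(int)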
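For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies the vectorized approach. The variable names is_illegal and flagged_users are only illustrative, and the isin-based variant is an alternative I am adding for comparison, not something from the posts above.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "user_id": ["a", "a", "a", "b", "b", "c", "c"],
    "record": ["No", "No", "Yes", "No", "No", "Yes", "Yes"],
})

# Boolean mask: True where the operation is illegal.
is_illegal = df["record"].eq("Yes")

# Broadcast "any illegal operation in this user's group" back to every row,
# then cast the booleans to 1/0.
df["history"] = is_illegal.groupby(df["user_id"]).transform("any").astype(int)

# Alternative without groupby (illustrative): collect the users that have
# at least one illegal row and test membership.
flagged_users = df.loc[is_illegal, "user_id"].unique()
history_alt = df["user_id"].isin(flagged_users).astype(int)

print(df)
```

Both variants should give the same column on this data. The groupby form generalizes more easily to other per-group reductions, while the isin form reads more directly as "is this user in the flagged set".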