How to find the percentage of unique rows of one column with a condition on another column?
Question:
I have a dataframe ‘merged_df’ that looks like this –
Login ID | Enable |
---|---|
cab001 | 1 |
cab003 | 0 |
cab002 | 1 |
cab003 | 0 |
The Login ID column contains many duplicates, and the Enable column contains only 0s and 1s.
I want to find the percentage of unique Login IDs (relative to the total number of unique Login IDs) that have Enable = 1.
For the above table, the number of unique Login IDs is 3 (cab001, cab002, cab003).
Percentage of unique Login IDs that have Enable = 1: (2/3)*100, which is about 66.67.
Answers:
You can remove the duplicates in the Login ID column by doing the following (the column names must match the dataframe exactly, so "Login ID" and "Enable" rather than "login_id" and "enable"):
no_duplicates = merged_df.drop_duplicates(subset="Login ID")
Then you can calculate the desired percentage:
percentage = (len(no_duplicates[no_duplicates["Enable"] == 1]) / len(no_duplicates)) * 100
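To try this end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same idea. Note that drop_duplicates keeps the first row it sees for each Login ID, so the result only makes sense if a given Login ID always carries the same Enable value. The second calculation is an assumption about what you might want otherwise: it counts a Login ID as enabled if any of its rows has Enable = 1. The names enabled_any and percentage_any are made up for this illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
merged_df = pd.DataFrame(
    {
        "Login ID": ["cab001", "cab003", "cab002", "cab003"],
        "Enable": [1, 0, 1, 0],
    }
)

# Approach from the answer: keep one row per Login ID (the first one seen).
no_duplicates = merged_df.drop_duplicates(subset="Login ID")
# The mean of a boolean Series is the fraction of True values.
percentage = (no_duplicates["Enable"] == 1).mean() * 100

# Alternative (assumed rule, not from the question): a Login ID counts as
# enabled if ANY of its rows has Enable == 1.
enabled_any = merged_df.groupby("Login ID")["Enable"].max()
percentage_any = (enabled_any == 1).mean() * 100

print(round(percentage, 2))
print(round(percentage_any, 2))
```

On the sample table both calculations agree, since cab003 is 0 in both of its rows. They would differ only if some Login ID had mixed 0 and 1 values.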