How to find the percentage of unique rows of one column with a condition on another column?

Question:

I have a dataframe, merged_df, that looks like this:

Login ID    Enable
cab001      1
cab003      0
cab002      1
cab003      0

The Login ID column contains many duplicates, and the Enable column contains only 0s and 1s.

I want to find the percentage of unique Login IDs (relative to the total number of unique Login IDs) that have an Enable value of 1.

For the above table, the number of unique Login IDs is 3.

Percentage of unique Login IDs that have an Enable value of 1 = (2/3)*100, which is about 66.67.

Asked By: Atif


Answers:

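You can remove the duplicates in the Login ID column by doing the following (column names here match the table in the question):

no_duplicates = merged_df.drop_duplicates(subset="Login ID")

By default, drop_duplicates keeps the first row it sees for each Login ID, so this assumes a given Login ID always has the same Enable value.

Then you can calculate the desired percentage:

percentage = (len(no_duplicates[no_duplicates["Enable"] == 1]) / len(no_duplicates)) * 100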
Answered By: DPM
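
If the same Login ID can appear with both 0 and 1 in Enable, the result of the approach above depends on row order. A minimal sketch of an order-independent variant follows. It assumes that a Login ID counts as enabled when at least one of its rows has Enable equal to 1, and that the dataframe is not empty; that interpretation, the helper name percent_enabled, and the rebuilt sample data are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
merged_df = pd.DataFrame(
    {
        "Login ID": ["cab001", "cab003", "cab002", "cab003"],
        "Enable": [1, 0, 1, 0],
    }
)


def percent_enabled(df: pd.DataFrame) -> float:
    """Percentage of unique Login IDs that have at least one Enable == 1 row.

    Hypothetical helper; assumes df has at least one row.
    """
    # Number of distinct Login IDs overall.
    total_unique = df["Login ID"].nunique()
    # Number of distinct Login IDs among the rows where Enable is 1.
    enabled_unique = df.loc[df["Enable"] == 1, "Login ID"].nunique()
    return enabled_unique / total_unique * 100


percentage = percent_enabled(merged_df)
print(round(percentage, 2))  # 66.67 for the sample table
```

Counting with nunique on the filtered rows avoids relying on which duplicate drop_duplicates happens to keep.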