How to calculate percentages using Pandas groupby

Question:

I have three users: s1 has 10 dollars, s2 has 10 and 20 dollars, and s3 has 20, 20 and 30 dollars. I want to calculate the percentage of users who had 10, 20 and 30 dollars. Is my interpretation correct here?

input

import pandas as pd
df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})

expected output

% of subjects who had 10 dollars        0.4
% of subjects who had 20 dollars        0.4
% of subjects who had 30 dollars        0.2

tried

df1.groupby(['dollars']).agg({'dollars': 'sum'}) / df1['dollars'].sum() * 100
Asked By: ferrelwill


Answers:

To get the percentage of users that have each kind of bill, you can use a crosstab:

out = pd.crosstab(df1['users'], df1['dollars']).gt(0).mean().mul(100)

output:

dollars
10    66.666667
20    66.666667
30    33.333333
dtype: float64

If you want normalized counts:

out/out.sum()

Output:

dollars
10    0.4
20    0.4
30    0.2
dtype: float64
Answered By: mozway
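
To make the chained crosstab expression above easier to follow, here is a minimal sketch that splits it into named steps. The intermediate names (counts, has_amount, pct_users, shares) are illustrative and do not appear in the original answer; the data is the df1 from the question.

```python
import pandas as pd

df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})

# Step 1: count rows per (user, amount) pair; users are rows, amounts are columns.
counts = pd.crosstab(df1['users'], df1['dollars'])

# Step 2: convert counts to booleans: does this user hold this amount at least once?
has_amount = counts.gt(0)

# Step 3: the column mean of booleans is the fraction of users holding each amount.
pct_users = has_amount.mean().mul(100)

# Step 4: rescale so the values sum to 1 (the "normalized counts" variant).
shares = pct_users / pct_users.sum()

print(pct_users)
print(shares)
```

Note that the two results answer different questions: pct_users is the share of all users who hold each amount (the values need not sum to 100), while shares rescales those numbers so they sum to 1, which matches the output requested in the question.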

Use SeriesGroupBy.nunique to count the unique users per dollar amount, divide by the number of unique dollar amounts, and finally divide by the sum:

out = df1.groupby('dollars')['users'].nunique().div(df1['dollars'].nunique())
out = out / out.sum()

print(out)
dollars
10    0.4
20    0.4
30    0.2
Name: users, dtype: float64
Answered By: jezrael
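
As a possible simplification of the groupby approach above: dividing by the number of unique dollar amounts is cancelled by the final normalization, so the intermediate division can be dropped. The sketch below also shows an alternative that is not from either answer, using drop_duplicates with value_counts(normalize=True). The variable names are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'users': ['s1', 's2', 's2', 's3', 's3', 's3'],
                    'dollars': [10, 10, 20, 20, 20, 30]})

# Variant 1: unique users per amount, normalized so the values sum to 1.
# Any constant divisor applied before this normalization would cancel out.
users_per_amount = df1.groupby('dollars')['users'].nunique()
shares_groupby = users_per_amount / users_per_amount.sum()

# Variant 2 (alternative technique): keep one row per (user, amount) pair,
# then let value_counts compute the normalized frequencies directly.
pairs = df1.drop_duplicates(['users', 'dollars'])
shares_counts = pairs['dollars'].value_counts(normalize=True).sort_index()

print(shares_groupby)
print(shares_counts)
```

Both variants count each user at most once per amount, so s3 holding 20 dollars twice does not inflate the result.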