Adding a percentage column to a DataFrame in Python

Question:

Sorry, I'm very new to Python.
I have a dataset "olympics games":
dataset and columns

olympics.isnull().sum()
ID             0
Name           0
Sex            0
Age         9315
Height     58814
Weight     61527
Team           0
NOC            0
Games          0
Year           0
Season         0
City           0
Sport          0
Event          0
Medal     229959
dtype: int64

and I have created a dataframe that shows the number of athletes grouped by 'Sex' for the USA team:

sex_counts_usa = pd.DataFrame(team_usa.groupby('Sex').count()['ID']).sort_values(by = 'Sex', ascending = False)

How can I add a new column to this dataframe that shows the same results as percentages?

Many thanks in advance.

Asked By: Adam


Answers:

Try this

# count athletes by sex
sex_counts_usa = team_usa['Sex'].value_counts().to_frame('Count')
# percentage of athletes by sex
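# select the 'Count' column explicitly so the result is a Series, not a one-column DataFrame
sex_counts_usa['Percentage'] = (sex_counts_usa['Count'] / sex_counts_usa['Count'].sum() * 100).astype('string') + '%'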

If the aim is only to count rows by a single column such as Sex, I think it's better to use .value_counts() than .groupby('Sex').count()['ID'].
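
For anyone who wants to run this end to end, here is a minimal sketch. The four-row team_usa frame is made up for illustration (the real data isn't shown in the question). It also keeps the Percentage column numeric instead of turning it into strings, which is an assumption on my part about what is more convenient, since numeric values can still be sorted or plotted.

```python
import pandas as pd

# Hypothetical stand-in for team_usa; the real dataset is not shown in the question.
team_usa = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Sex": ["M", "M", "M", "F"],
})

# Count athletes by sex, as a one-column DataFrame named 'Count'
sex_counts_usa = team_usa["Sex"].value_counts().to_frame("Count")

# Percentage of the total, kept as a float rounded to two decimals
sex_counts_usa["Percentage"] = (
    sex_counts_usa["Count"] / sex_counts_usa["Count"].sum() * 100
).round(2)

print(sex_counts_usa)
```

If you need the '%' sign, it is probably easier to add it only when displaying the table, so the column stays usable for calculations.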

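value_counts() can be called twice: once for the counts and again, with normalize=True, for the relative frequencies. Note that normalize=True returns fractions between 0 and 1, so multiply by 100 if you want actual percentages.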

result = pd.DataFrame({
    'Count': team_usa['Sex'].value_counts(), 
    'Percentage': team_usa['Sex'].value_counts(normalize=True)
})
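
As a rough, self-contained version of the same idea, the sketch below uses a small invented team_usa frame (an assumption, since the original data isn't available here) and scales the normalized values to percentages. The rounding to one decimal is just a display choice.

```python
import pandas as pd

# Hypothetical sample data standing in for team_usa
team_usa = pd.DataFrame({"Sex": ["M", "F", "M", "F", "M"]})

counts = team_usa["Sex"].value_counts()
# normalize=True gives fractions that sum to 1
shares = team_usa["Sex"].value_counts(normalize=True)

# Both Series share the same index (the Sex values), so they align row by row
result = pd.DataFrame({
    "Count": counts,
    "Percentage": (shares * 100).round(1),
})

print(result)
```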