# Percentages of a single column's values in separate columns

## Question:

For the following dataframe:

``````
   person  choice
0  A       1
1  A       2
2  A       1
3  B       3
4  B       3
5  B       2
6  B       1
7  C       2
``````

how can I find the percentage of each choice per person?

The output should be something like the following:

``````
person  choice_1_count choice_2_count choice_3_count  total
A                    2              1              0      3
B                    1              1              2      4
C                    0              1              0      1
``````

to be used to find percentages:

``````
person  choice_1_percent  choice_2_percent  choice_3_percent
A                  66.67             33.33              0.00
B                  25.00             25.00             50.00
C                   0.00            100.00              0.00
``````

The format of the final dataframe matters, for example for sorting and plotting the percentage columns and for further analysis.

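``````
# Long format: one row per (person, choice) pair that occurs in the data
df = df.value_counts(['person', 'choice']).sort_index().to_frame('count')
# Share of each choice within a person, scaled to a percentage
df["percent"] = df["count"] / df.groupby('person')['count'].transform('sum') * 100
``````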
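The snippet above yields a long table with a `(person, choice)` index, whereas the question asks for one column per choice. A minimal sketch of one way to pivot it, assuming the sample data from the question (the names `raw`, `counts`, `percent`, and `wide` are illustrative and not from the original post):

```python
import pandas as pd

# Sample data copied from the question.
raw = pd.DataFrame({
    "person": list("AAABBBBC"),
    "choice": [1, 2, 1, 3, 3, 2, 1, 2],
})

# Long format: one entry per (person, choice) pair that actually occurs.
counts = raw.value_counts(["person", "choice"]).sort_index()

# Share of each choice within a person, as a percentage.
percent = counts / counts.groupby(level="person").transform("sum") * 100

# Move the "choice" index level into columns; pairs that never occur become 0.
wide = percent.unstack("choice", fill_value=0)
print(wide.round(2))
```

Here `unstack` turns the inner index level into columns, so combinations that never occur (such as person C with choice 1) have to be filled explicitly, which is what `fill_value=0` does.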

Let's use `crosstab` to compute the frequency table. Passing `normalize='index'` divides each row by its total, so every person's row sums to 1; multiplying by 100 then gives percentages.

``````
dist = pd.crosstab(df['person'], df['choice'], normalize='index') * 100
``````

Result:

``````
choice          1           2     3
person
A       66.666667   33.333333   0.0
B       25.000000   25.000000  50.0
C        0.000000  100.000000   0.0
``````
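To get closer to the exact layout requested in the question (a counts table with a `total` column, and percentage columns named `choice_N_percent`), something along these lines may work. It is a sketch built on the question's sample data; the variable names `raw`, `counts`, and `percent` are assumptions made for illustration:

```python
import pandas as pd

# Sample data copied from the question.
raw = pd.DataFrame({
    "person": list("AAABBBBC"),
    "choice": [1, 2, 1, 3, 3, 2, 1, 2],
})

# Counts per person and choice, plus a per-person total.
counts = pd.crosstab(raw["person"], raw["choice"])
counts.columns = [f"choice_{c}_count" for c in counts.columns]
counts["total"] = counts.sum(axis=1)

# Row-normalised percentages, rounded to two decimals as in the question.
percent = (
    pd.crosstab(raw["person"], raw["choice"], normalize="index")
    .mul(100)
    .round(2)
)
percent.columns = [f"choice_{c}_percent" for c in percent.columns]

print(counts)
print(percent)
```

Renaming the columns to plain strings also drops the `choice` label from the column index, which tends to make later column selection and merging simpler.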

Then you can plot the percentages:

``````
dist.plot(kind='bar')
``````
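Since the question also mentions sorting and further analysis, here is a small sketch of what that could look like on the wide table. It assumes the sample data from the question; `by_choice_3` and `top_choice` are illustrative names:

```python
import pandas as pd

# Sample data copied from the question.
raw = pd.DataFrame({
    "person": list("AAABBBBC"),
    "choice": [1, 2, 1, 3, 3, 2, 1, 2],
})

dist = pd.crosstab(raw["person"], raw["choice"], normalize="index") * 100

# Sort people by their share of choice 3, highest first.
# Column labels are the original integer choices, so use 3 rather than "3".
by_choice_3 = dist.sort_values(by=3, ascending=False)

# Most frequent choice for each person.
top_choice = dist.idxmax(axis=1)

print(by_choice_3)
print(top_choice)
```

Note that `dist.plot(kind='bar')` relies on matplotlib being installed, while the sorting and `idxmax` steps need only pandas.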
