How to create a stacked percentage bar graph from a dataframe with percentage values
Question:
I have the following dataframe:
Class Percentage
class1 0.215854
class2 0.12871
class3 0.122787
class4 0.0680061
class5 0.0670523
class6 0.0826716
class7 0.309828
class8 0
class9 0
How can I create a stacked vertical bar graph where y goes from 0-100% and the percentage data is plotted? I would also like to add a legend with a color corresponding to a class.
Code I’ve tried:
df.T.plot(kind='bar',stacked=True)
results in error: TypeError: Empty 'DataFrame': no numeric data to plot
classgraph,texts = plt.bar(df["Percentage"],height=5) #added texts for later legend
gave error:
Traceback (most recent call last):
File "<ipython-input-71-894dc447893f>", line 1, in <module>
classgraph,texts = plt.bar(dataframe_plot["Percentage"],height=5)
ValueError: too many values to unpack (expected 2)
I have read quite a few posts on how to do this, but I can’t seem to figure it out.
Answers:
Well, let’s say you have this dataframe:
import pandas as pd
import seaborn as sns
sns.set_style("darkgrid")
data = {'Class': ['class1', 'class2', 'class3'],
        'mid-term': [345, 123, 74],
        'final': [235, 345, 632]}
df = pd.DataFrame(data)
df.head()
#     Class  mid-term  final
# 0  class1       345    235
# 1  class2       123    345
# 2  class3        74    632
If you plot the raw counts, the two bars end up with different total heights (542 and 1212), so the chart does not read as percentages.
df.set_index('Class').T.plot(kind='bar', stacked=True)
To fix this, convert each column to percentages of its own total, then plot the percentage columns:
df['mid-per'] = (df['mid-term'] / df['mid-term'].sum() * 100)
df['final-per'] = (df['final'] / df['final'].sum() * 100)
df.set_index('Class')[['mid-per', 'final-per']].T.plot(kind='bar', stacked=True)
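The same idea can be applied directly to the dataframe in the question. The following is a minimal sketch, not the only way to do it. It assumes the values in Percentage are fractions of 1 (they add up to about 0.995), so they are multiplied by 100 first. The names wide and ax and the axis label "Share" are made up for illustration. Moving Class into the index before transposing is what avoids the "no numeric data to plot" error: with Class left as an ordinary column, df.T has object dtype. The second error arises because plt.bar returns a container with one entry per bar, which cannot be unpacked into two names.

```python
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter

# Data copied from the question; values are fractions of 1, not percents.
df = pd.DataFrame({
    "Class": ["class1", "class2", "class3", "class4", "class5",
              "class6", "class7", "class8", "class9"],
    "Percentage": [0.215854, 0.12871, 0.122787, 0.0680061, 0.0670523,
                   0.0826716, 0.309828, 0, 0],
})

# Put the labels in the index so only numeric data remains, then
# transpose: one row (one bar) and one column (one segment) per class.
wide = df.set_index("Class").T * 100

# Each column becomes one coloured segment with its own legend entry.
ax = wide.plot(kind="bar", stacked=True, rot=0)
ax.set_ylim(0, 100)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=100))
ax.set_ylabel("Share")
ax.legend(title="Class", bbox_to_anchor=(1.02, 1), loc="upper left")
plt.tight_layout()
```

In a script, call plt.show() afterwards to display the figure. Because the values in the question sum to slightly less than 1, the bar stops just short of the 100% mark.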