Create a stacked bar plot of percentages and annotate with count

Question:

My raw data:

I have this data (df). I computed the percentages from it (rel) and plotted them as a stacked bar graph.

Now I want to add the values from my first dataframe (the counts, not the percentages) to the center of each bar.

My code for now:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from csv import reader
import seaborn as sns

df = pd.DataFrame({'IL':['Balıkesir', 'Bursa', 'Çanakkale', 'Edirne', 'İstanbul', 'Kırklareli', 'Kocaeli', 'Sakarya','Tekirdağ','Yalova'],'ENGELLIUYGUN':[7,13,3,1,142,1,14,1,2,2],'ENGELLIUYGUNDEGIL':[1,5,0,0,55,0,3,0,1,0]})
iller=df.iloc[:,[0]]

df_total = df["ENGELLIUYGUN"] + df["ENGELLIUYGUNDEGIL"]
df_rel = df[df.columns[1:]].div(df_total, 0)*100

rel=[]
rel=pd.DataFrame(df_rel)
rel['İller'] = iller

d=df.iloc[:,[1]] #I want to add these values to the center of blue bars.
f=df.iloc[:,[2]] #I want to add these values to the center of green bars.

sns.set_theme (style='whitegrid')

ax=rel.plot(x='İller',kind='bar', stacked=True, color=["#3a88e2","#5c9e1e"], label=("Uygun","Uygun Değil"))

plt.legend(["Evet","Hayır"],fontsize=8, bbox_to_anchor=(1, 0.5))

plt.xlabel('...........',fontsize=12)
plt.ylabel('..........',fontsize=12)
plt.title('.............',loc='center',fontsize=14)
plt.ylim(0,100)
ax.yaxis.grid(color='gray', linestyle='dashed')

plt.show()

This is what I have for now:

[image: my current plot]

I want exactly the same style as in this picture:

[image: example]

I am using Anaconda-Jupyter Notebook.

Asked By: Melisa


Answers:

I don’t think there is any subtle method for this, so you have to print the values yourself by adding the text explicitly, which is not that hard to do. For example, if you add this just after your plot:

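# d and f hold the raw counts; df_rel holds the percentages that were plotted
for i in range(len(d)):
    # lower (blue) segment spans 0..p, so its center is at p/2
    ax.text(i, df_rel.iloc[i,0]/2, d.iloc[i,0], ha='center', va='center', fontweight='bold', color='#ffff00', fontsize='small')
    # upper (green) segment spans p..100, so its center is at 50 + p/2
    ax.text(i, 50+df_rel.iloc[i,0]/2, f.iloc[i,0], ha='center', va='center', fontweight='bold', color='#400040', fontsize='small')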

With that, each count is printed at the center of its bar segment.

You can of course change the color, size, position, etc. (I am well known for my total lack of good taste in such matters.) You can also apply any arbitrary rule you like, such as not printing ‘0’. That is the advantage of doing things explicitly: your code, your rules; you don’t have to fight an existing API to convince it to do it your way.
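As a rough illustration of such a rule, here is a minimal sketch that computes where each label would go in a 100% stacked bar and skips zero counts. It is not part of the original answer: the helper name stacked_label_positions is hypothetical, and it assumes exactly two stacked series, as in the question.

```python
def stacked_label_positions(lower_counts, upper_counts):
    """Return (x, y, text) tuples for labelling a two-series 100% stacked bar.

    y is the vertical center of the segment in percent units.
    Segments whose count is zero are skipped, so no '0' is printed.
    """
    positions = []
    for x, (lower, upper) in enumerate(zip(lower_counts, upper_counts)):
        total = lower + upper
        if total == 0:
            continue  # empty bar: nothing to label
        lower_pct = 100 * lower / total
        upper_pct = 100 * upper / total
        if lower > 0:
            # lower segment spans 0..lower_pct
            positions.append((x, lower_pct / 2, str(lower)))
        if upper > 0:
            # upper segment spans lower_pct..100
            positions.append((x, lower_pct + upper_pct / 2, str(upper)))
    return positions


# counts taken from the question's dataframe
uygun = [7, 13, 3, 1, 142, 1, 14, 1, 2, 2]
uygun_degil = [1, 5, 0, 0, 55, 0, 3, 0, 1, 0]
positions = stacked_label_positions(uygun, uygun_degil)
```

Each tuple could then be drawn with something like ax.text(x, y, text, ha='center', va='center'), where ax is the Axes returned by the plot call.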

Answered By: chrslg
  • Answering: "I want to add values (non percentage values) to the centers of each bar but from my first dataframe."
  • The correct way to annotate bars is with .bar_label, as explained in this answer.
    • The values from df can be passed to the labels= parameter instead of the percentages.
  • This answer shows how to succinctly calculate the percentages, but plots the counts and annotates with percentage and value, whereas this OP wants to plot the percentage on the y-axis and annotate with counts.
  • This answer shows how to place the legend at the bottom of the plot.
  • This answer shows how to format the axis tick labels as percent.
  • See pandas.DataFrame.plot for an explanation of the available parameters.
  • "I am using Anaconda-Jupyter Notebook": everything from the comment # plot percent; … onward should be in the same notebook cell.
  • Tested in python 3.11, pandas 1.5.2, matplotlib 3.6.2
import pandas as pd
import matplotlib.ticker as tkr

# sample data
df = pd.DataFrame({'IL': ['Balıkesir', 'Bursa', 'Çanakkale', 'Edirne', 'İstanbul', 'Kırklareli', 'Kocaeli', 'Sakarya','Tekirdağ','Yalova'],
                   'ENGELLIUYGUN': [7, 13, 3, 1, 142, 1, 14, 1, 2, 2],
                   'ENGELLIUYGUNDEGIL': [1, 5, 0, 0, 55, 0, 3, 0, 1, 0]})

# set IL as the index
df = df.set_index('IL')

# calculate the percent
per = df.div(df.sum(axis=1), axis=0).mul(100)

# plot percent; adjust rot= for the rotation of the xtick labels
ax = per.plot(kind='bar', stacked=True, figsize=(10, 8), rot=0,
              color=['#3a88e2', '#5c9e1e'], yticks=range(0, 101, 10),
              title='my title', ylabel='', xlabel='')

# move the legend
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05), ncol=2, frameon=False)

# format the y-axis tick labels
ax.yaxis.set_major_formatter(tkr.PercentFormatter())

# iterate through the containers
for c in ax.containers:
    
    # get the current segment label (a string); corresponds to column / legend
    col = c.get_label()
    
    # use label to get the appropriate count values from df
    # customize the label to account for cases when there might not be a bar section
    labels = [v if v > 0 else '' for v in df[col]]
    # the following will also work
    # labels = df[col].replace(0, '')
    
    # add the annotation
    ax.bar_label(c, labels=labels, label_type='center', fontweight='bold')


Alternate Annotation Implementation

  • Since the column names in df and per are the same, they can be extracted directly from per.
# iterate through the containers and per column names
for c, col in zip(ax.containers, per):
    
    # add the annotations with custom labels from df
    ax.bar_label(c, labels=df[col].replace(0, ''), label_type='center', fontweight='bold')
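
Pairing containers with columns through zip relies on the bar containers being created in column order. The following is a small sketch of one way to check that and to inspect the label texts, not a required step. It assumes matplotlib 3.4 or later (for bar_label), uses only the first five rows of the sample data, and selects the Agg backend purely so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed inside a notebook
import pandas as pd

# first five rows of the sample data, for brevity
df = pd.DataFrame({'IL': ['Balıkesir', 'Bursa', 'Çanakkale', 'Edirne', 'İstanbul'],
                   'ENGELLIUYGUN': [7, 13, 3, 1, 142],
                   'ENGELLIUYGUNDEGIL': [1, 5, 0, 0, 55]}).set_index('IL')
per = df.div(df.sum(axis=1), axis=0).mul(100)

ax = per.plot(kind='bar', stacked=True)

# each container carries the name of the column it was drawn from
container_labels = [c.get_label() for c in ax.containers]
if container_labels != list(per.columns):
    raise RuntimeError('container order does not match column order')

# annotate with counts and keep the resulting label texts for inspection
annotations = {}
for c, col in zip(ax.containers, per):
    labels = [v if v > 0 else '' for v in df[col]]
    texts = ax.bar_label(c, labels=labels, label_type='center', fontweight='bold')
    annotations[col] = [t.get_text() for t in texts]
```

If the order ever did not match, looking the column up with c.get_label(), as in the first loop of this answer, would be the safer choice.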
Answered By: Trenton McKinney