How to add multi-level bar labels to a bar chart with hue

Question:

This is a continuation of my previous question, but now I have a bar chart with hue.

Here’s what I have:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({'age': ['20-30', '20-30', '20-30', '30-40', '30-40', '30-40', '40-50', '40-50', '40-50', '50-60', '50-60', '50-60'],
                   'expenses': ['50$', '100$', '200$', '50$', '100$', '200$', '50$', '100$', '200$', '50$', '100$', '200$'],
                   'users': [59, 42, 57, 68, 47, 98, 75, 73, 54, 81, 52, 43],
                   'buyers': [22, 35, 18, 27, 12, 57, 19, 29, 31, 47, 10, 5],
                   'percentage': [37.2881, 83.3333, 31.5789, 39.7058, 25.5319, 58.1632, 25.3333, 39.7260, 57.4074, 58.0246, 19.2307, 11.6279]})

index  age    expenses  users  buyers  percentage
0      20-30  50$       59     22      37.2881
1      20-30  100$      42     35      83.3333
2      20-30  200$      57     18      31.5789
3      30-40  50$       68     27      39.7058
4      30-40  100$      47     12      25.5319
5      30-40  200$      98     57      58.1632
6      40-50  50$       75     19      25.3333
7      40-50  100$      73     29      39.7260
8      40-50  200$      54     31      57.4074
9      50-60  50$       81     47      58.0246
10     50-60  100$      52     10      19.2307
11     50-60  200$      43     5       11.6279

fig, ax = plt.subplots(figsize=(20, 10))

# Plot all users
sns.barplot(x='age', y='users', data=df, hue='expenses', palette='Blues', edgecolor='grey', alpha=0.7, ax=ax)
# Plot the buyers
sns.barplot(x='age', y='buyers', data=df, hue='expenses', palette='Blues', edgecolor='darkgrey', hatch='//', ax=ax)

plt.show()


I need to get the same annotated chart as before. With hue, however, this code:

# extract the separate containers
c1, c2 = ax.containers

# annotate with the users values
ax.bar_label(c1, fontsize=13)

# annotate with the buyer and percentage values
l2 = [f"{v.get_height()}: {df.loc[i, 'percentage']}%" for i, v in enumerate(c2)]
ax.bar_label(c2, labels=l2, fontsize=8, label_type='center', fontweight='bold')

no longer works.
I would be grateful for any hints.

Asked By: Kate


Answers:

  • Each object in ax.containers represents the bars for a single hue group, so two barplot calls with three hue levels produce six containers, not the two that c1, c2 = ax.containers expects.
    • When using bar_label on each container in turn, the annotations are added for every bar in the '50$' group, then the '100$' group, and then the '200$' group.
  • I think it’s easier to select the correct data by annotating the 'buyers' group separately.
    • The answer to your previous question selects the data from the entire dataframe, but here Boolean indexing is used to select only the rows that belong to one hue group. Using print(data) in each loop will help with understanding.

fig, ax = plt.subplots(figsize=(20, 10))

# plot all users
sns.barplot(x='age', y='users', data=df, hue='expenses', palette='Blues', edgecolor='grey', alpha=0.7, ax=ax)

# annotate the bars in the 3 containers (1 container per hue group)
for c in ax.containers:
    ax.bar_label(c)
    
# plot the 'buyers', which adds 3 more containers to ax
sns.barplot(x='age', y='buyers', data=df, hue='expenses', palette='Blues', edgecolor='darkgrey', hatch='//', ax=ax)

# iterate through the last 3 new containers containing the hatched groups 
for c in ax.containers[3:]:
    
    # get the hue label, which will be used to select the data group
    hue_label = c.get_label()
    # select the data based on hue_label
    data = df.loc[df.expenses.eq(hue_label), ['buyers', 'percentage']]
    # customize the labels
    labels = [f"{v.get_height()}: {data.iloc[i, 1]:0.2f}%" for i, v in enumerate(c)]
    # add the labels
    ax.bar_label(c, labels=labels)

plt.show()

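To see what the Boolean indexing selects without drawing anything, here is a minimal sketch that builds the same label strings per hue group using pandas only. The helper name labels_for_hue is hypothetical (it is not part of the answer above), and the sketch assumes each age/expenses pair occurs exactly once, so the bar height equals the value in 'buyers'. Under that assumption it reads the count from the column rather than from v.get_height().

```python
import pandas as pd

# Same data as in the question; 'users' is omitted because it is not needed for the labels.
df = pd.DataFrame({
    'age': ['20-30'] * 3 + ['30-40'] * 3 + ['40-50'] * 3 + ['50-60'] * 3,
    'expenses': ['50$', '100$', '200$'] * 4,
    'buyers': [22, 35, 18, 27, 12, 57, 19, 29, 31, 47, 10, 5],
    'percentage': [37.2881, 83.3333, 31.5789, 39.7058, 25.5319, 58.1632,
                   25.3333, 39.7260, 57.4074, 58.0246, 19.2307, 11.6279],
})


def labels_for_hue(frame, hue_label):
    """Hypothetical helper: one 'buyers: percentage%' label per age group for one hue level."""
    # Boolean indexing keeps only the rows of this hue group, in age order,
    # which is the order of the bars inside the matching container.
    data = frame.loc[frame['expenses'].eq(hue_label), ['buyers', 'percentage']]
    return [f"{b}: {p:0.2f}%" for b, p in zip(data['buyers'], data['percentage'])]


# One list of labels per hue level, in order of first appearance in the data.
for hue_label in df['expenses'].unique():
    print(hue_label, labels_for_hue(df, hue_label))
```

Each printed list has four entries, one per age group, which is why the labels line up with the four bars in a container when passed as labels= to ax.bar_label.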

Answered By: Trenton McKinney