seaborn: barplot of a dataframe by group

Question:

I am having difficulty with this. I have the results from my initial model (`Unfiltered`), which I plot like so:

import pandas as pd
import seaborn as sns

df = pd.DataFrame(
    {'class': ['foot', 'bike', 'bus', 'car', 'metro'],
     'Precision': [0.7, 0.66, 0.41, 0.61, 0.11],
     'Recall': [0.58, 0.35, 0.13, 0.89, 0.02],
     'F1-score': [0.64, 0.45, 0.2, 0.72, 0.04]}
)

# Reshape to long format: one row per (class, Metric) pair
groups = df.melt(id_vars=['class'], var_name='Metric')
sns.barplot(data=groups, x='class', y='value', hue='Metric')

This produces a nice plot:
[image: grouped bar plot with one bar per metric for each class]

Now I have obtained a second set of results from my improved model (`Filtered`), so I added a column (`status`) to my DataFrame to indicate which model each row comes from:

df2 = pd.DataFrame(
    {'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
               'car', 'car', 'metro', 'metro'],
     'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
     'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
     'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
     'status': ['Unfiltered', 'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered',
                'Filtered', 'Unfiltered', 'Filtered', 'Unfiltered', 'Filtered']}
)

df2.head()
  class  Precision  Recall  F1-score      status
0  foot       0.70    0.58      0.64  Unfiltered
1  foot       0.62    0.93      0.74    Filtered
2  bike       0.66    0.35      0.45  Unfiltered
3  bike       0.96    0.40      0.56    Filtered
4   bus       0.41    0.13      0.20  Unfiltered

I want to plot this with the same grouping as above (i.e. foot, bike, bus, car, metro). However, for each metric I want the two values placed side by side. For the foot group, for example, I would have two bars for Precision [Unfiltered, Filtered], two bars for Recall [Unfiltered, Filtered], and two bars for F1-score [Unfiltered, Filtered], and likewise for all other groups.

My attempt:

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
sns.barplot(data=group2, x='class', y='value', hue='Metric')

[image: plot produced by the code above]

Totally not what I want.

Asked By: Amina Umar

||

Answers:

You can pass any sequence to `hue`, as long as it has the same length as your data, and the colours are assigned according to its unique values.
So you could try:

import matplotlib.pyplot as plt
import seaborn as sns

group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')
sns.barplot(data=group2, x='class', y='value', hue=group2[['Metric','status']].agg(tuple, axis=1))
plt.legend(fontsize=7)

But the result is a bit hard to read:
[image: bar plot with six (Metric, status) bars per class]
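
A possible variant, offered as a sketch rather than as part of the original answer: combine Metric and status into one string column, fix the bar order with `hue_order`, and use seaborn's 'Paired' palette so that each metric gets a light/dark pair of colours. The column name 'Metric (status)' and the chosen ordering are assumptions made for illustration. The DataFrame repeats the question's data.

```python
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame({
    'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
              'car', 'car', 'metro', 'metro'],
    'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
    'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
    'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
    'status': ['Unfiltered', 'Filtered'] * 5,
})

# Long format: one row per (class, status, Metric) combination
group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# Recode the two grouping variables as a single variable with six levels
# (the column name is hypothetical, chosen only for this example)
group2['Metric (status)'] = group2['Metric'] + ' (' + group2['status'] + ')'

# Keep the two bars of each metric next to each other
hue_order = [
    f'{metric} ({status})'
    for metric in ['Precision', 'Recall', 'F1-score']
    for status in ['Unfiltered', 'Filtered']
]

ax = sns.barplot(
    data=group2, x='class', y='value',
    hue='Metric (status)', hue_order=hue_order,
    palette='Paired',  # consecutive colours come in light/dark pairs
)
ax.legend(fontsize=7)
# In a script, call matplotlib.pyplot.show() to display the figure.
```

With this ordering, the lighter bar of each pair is the Unfiltered result and the darker bar is the Filtered one.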

Answered By: Yolao_21

Seaborn grouped barplots don’t allow for multiple grouping variables: `hue` takes a single variable. One workaround is to recode the two grouping variables (Metric and status) as one variable with six levels. Another possibility is to use facets. If you are open to another plotting package, I would recommend plotnine, which allows multiple grouping variables as follows:

import plotnine as p9

fig = (
    p9.ggplot(group2)
    + p9.geom_col(
        p9.aes(x="class", y="value", fill="Metric", color="Metric", alpha="status"),
        position=p9.position_dodge(1),
        size=1,
        width=0.5,
    )
    + p9.scale_color_manual(("red", "blue", "green"))
    + p9.scale_fill_manual(("red", "blue", "green"))
)

fig.draw()

This generates the following image:
[image: plotnine bar plot with fill colour by Metric and transparency by status]
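
The facet workaround mentioned above could look roughly like the following when staying within seaborn. This is a minimal sketch, not the only way to do it: it assumes one panel per class via `sns.catplot`, with Metric on the x-axis and status as the hue, so the Unfiltered and Filtered bars of each metric sit side by side. The layout values (`col_wrap`, `height`, `aspect`) are arbitrary choices.

```python
import pandas as pd
import seaborn as sns

df2 = pd.DataFrame({
    'class': ['foot', 'foot', 'bike', 'bike', 'bus', 'bus',
              'car', 'car', 'metro', 'metro'],
    'Precision': [0.7, 0.62, 0.66, 0.96, 0.41, 0.42, 0.61, 0.75, 0.11, 0.3],
    'Recall': [0.58, 0.93, 0.35, 0.4, 0.13, 0.1, 0.89, 0.86, 0.02, 0.01],
    'F1-score': [0.64, 0.74, 0.45, 0.56, 0.2, 0.17, 0.72, 0.8, 0.04, 0.01],
    'status': ['Unfiltered', 'Filtered'] * 5,
})

# Long format: one row per (class, status, Metric) combination
group2 = df2.melt(id_vars=['class', 'status'], var_name='Metric')

# One facet per class; within each facet, bars are grouped by Metric
# and split by status
g = sns.catplot(
    data=group2, kind='bar',
    x='Metric', y='value', hue='status',
    col='class', col_wrap=3,   # layout values are arbitrary
    height=3, aspect=1.1,
)
g.set_axis_labels('', 'value')
g.set_titles(col_template='{col_name}')
# In a script, call matplotlib.pyplot.show() to display the figure.
```

The trade-off is that the classes end up in separate panels rather than on one shared x-axis, but each panel only needs two colours, which keeps the legend short.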

Answered By: evces