How to add a mean line to a seaborn stripplot or swarmplot

Question:

I have a rather simple strip plot with vertical data.

import seaborn as sns
import matplotlib.pyplot as plt

planets = sns.load_dataset("planets")
sns.stripplot(x="method", y="distance", data=planets, size=4, color=".7")
plt.xticks(rotation=45, ha="right")
plt.show()

I want to plot the mean of each x-element (method) as a small horizontal bar similar to what you get with:

sns.boxplot(
    x="method",
    y="distance",
    data=planets,
    whis=[50, 50],
    showfliers=False,
    showbox=False,
    showcaps=False
)

But I want it without the vertical lines for the first and third quartiles (with whis=[50, 50] they shrink to spots), and showing the mean instead of the median. Maybe there is a more elegant solution that doesn't involve a boxplot.

Asked By: BBQuercus

Answers:

  • seaborn.boxplot passes extra keyword arguments through to matplotlib, so the parameters documented for matplotlib.pyplot.boxplot can be used:
    • showmeans=True draws the mean of each group
    • meanline=True draws the mean as a line instead of a marker
    • meanprops={'color': 'k', 'ls': '-', 'lw': 2} sets the color, style and width of the mean line
    • medianprops={'visible': False} hides the median line
    • whiskerprops={'visible': False} hides the whisker lines
    • zorder=10 places the line on the top layer
  • Tested in matplotlib v3.4.2 and seaborn v0.11.1

import seaborn as sns
import matplotlib.pyplot as plt

# load the dataset
planets = sns.load_dataset("planets")

p = sns.stripplot(x="method", y="distance", data=planets, size=4, color=".7")
plt.xticks(rotation=45, ha="right")
p.set(yscale='log')

# plot the mean line
sns.boxplot(showmeans=True,
            meanline=True,
            meanprops={'color': 'k', 'ls': '-', 'lw': 2},
            medianprops={'visible': False},
            whiskerprops={'visible': False},
            zorder=10,
            x="method",
            y="distance",
            data=planets,
            showfliers=False,
            showbox=False,
            showcaps=False,
            ax=p)
plt.show()

  • Works similarly with a seaborn.swarmplot

Answered By: Trenton McKinney

Here’s a solution that computes the mean of each group with groupby and draws it with ax.hlines:

import seaborn as sns
import matplotlib.pyplot as plt

# load the dataset
planets = sns.load_dataset("planets")

p = sns.stripplot(x="method", y="distance", data=planets, size=4, color=".7", zorder=1)
plt.xticks(rotation=45, ha="right")
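p.set(yscale='log')

# mean per method; sort=False keeps the order of first appearance,
# which matches the positions seaborn assigns on the x-axis
df_mean = planets.groupby('method', sort=False)['distance'].mean()
for i, y in enumerate(df_mean):
    p.hlines(y, i - .25, i + .25, zorder=2)
plt.show()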
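
If the category order is passed explicitly, the same groupby and hlines idea can be wrapped in a small helper. This is only a sketch under assumptions: add_mean_lines and its parameters are hypothetical names (not part of seaborn or matplotlib), and the data frame is made-up stand-in data so the example runs without downloading the planets dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def add_mean_lines(ax, data, x, y, order, half_width=0.25, color="k", linewidth=2):
    """Draw a short horizontal line at the mean of `y` for each category in `order`.

    Hypothetical helper for illustration; not a seaborn or matplotlib function.
    """
    # Reindex so that position i on the x-axis belongs to order[i].
    means = data.groupby(x)[y].mean().reindex(order)
    for position, mean in enumerate(means):
        if pd.notna(mean):  # skip categories without any valid value
            ax.hlines(mean, position - half_width, position + half_width,
                      colors=color, linewidth=linewidth, zorder=2)
    return means


# Made-up stand-in data; swap in sns.load_dataset("planets") for the real thing.
df = pd.DataFrame({
    "method": ["Transit", "Transit", "Imaging", "Imaging", "Imaging", "Astrometry"],
    "distance": [100.0, 300.0, 10.0, 20.0, 60.0, 15.0],
})
order = list(df["method"].unique())  # order of first appearance

fig, ax = plt.subplots()
sns.stripplot(x="method", y="distance", data=df, order=order,
              size=4, color=".7", zorder=1, ax=ax)
means = add_mean_lines(ax, df, "method", "distance", order)
```

Passing the same order to both stripplot and the helper keeps the lines aligned with the points even if the groupby result comes back sorted differently. Call plt.show() or fig.savefig(...) afterwards to view the result.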

Answered By: Scott Boston

Here’s another hack that is similar to the boxplot idea but requires less overriding: draw a pointplot but with a confidence interval of width 0, and activate the errorbar "caps" to get a horizontal line with a parametrizable width:

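import seaborn as sns
import matplotlib.pyplot as plt

planets = sns.load_dataset("planets")
spec = dict(x="method", y="distance", data=planets)
sns.stripplot(**spec, size=4, color=".7")
# ci=0 collapses the error bar, so only its caps remain as a horizontal line at the mean
sns.pointplot(**spec, join=False, ci=0, capsize=.7, scale=0)
plt.xticks(rotation=45, ha="right")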

One downside that is evident here is that bootstrapping gets skipped for groups with a single observation, so you don’t get a mean line there. This may or may not be a problem in an actual application.
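
To check in advance which groups would be affected, you can count the valid observations per group. The snippet below is a minimal sketch with made-up stand-in data that reuses the planets column names; the variable names are assumptions for illustration.

```python
import pandas as pd

# Made-up stand-in data with the same column names as the planets dataset.
df = pd.DataFrame({
    "method": ["Transit", "Transit", "Imaging", "Astrometry", "Astrometry"],
    "distance": [100.0, 300.0, 10.0, 15.0, None],
})

# count() ignores missing values, so it reflects the points that actually get plotted.
n_valid = df.groupby("method")["distance"].count()

# Groups with exactly one valid observation would end up without a mean line.
single_obs = n_valid[n_valid == 1].index.tolist()
print(single_obs)
```

For those groups the mean is just the single value, so a line could be added separately if it matters for the plot.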

Another trick would be to do the groupby yourself and then draw a scatterplot with a very wide vertical line marker:

import seaborn as sns
import matplotlib.pyplot as plt

planets = sns.load_dataset("planets")
variables = dict(x="method", y="distance")
sns.stripplot(data=planets, **variables, size=4, color=".7")
# a tiny "|" marker drawn with a large linewidth shows up as a short horizontal bar
sns.scatterplot(
    data=planets.groupby("method")["distance"].mean().reset_index(),
    **variables, marker="|", s=2, linewidth=25
)
plt.xticks(rotation=45, ha="right")

Answered By: mwaskom