How to plot points over a violin plot?

Question:

I have four pandas Series and I plot them using a violin plot as follows:

import seaborn
seaborn.violinplot([X1['total'], X2['total'], X3['total'], X4['total']])

I would like to plot the values on top of the violin plot so I added:

seaborn.stripplot([X1['total'], X2['total'], X3['total'], X4['total']])

But this plots all the points over the first violin plot.

What am I doing wrong?

Asked By: Simd

||

Answers:

To plot the values on top of the violin plot, you can use the swarmplot function from the seaborn library. This function will overlay a scatterplot on top of the violin plot, with the points representing the individual data points in each series.

import seaborn

# Plot the violin plot
seaborn.violinplot([X1['total'], X2['total'], X3['total'], X4['total']])

# Overlay the swarmplot
seaborn.swarmplot([X1['total'], X2['total'], X3['total'], X4['total']], color='k')

This will create a violin plot with the data from the four pandas Series, and then overlay a scatterplot on top of the violin plot showing the individual data points.

You can customize the appearance of the violin plot and the swarmplot by using various parameters of the violinplot and swarmplot functions. For example, you can use the inner parameter of the violinplot function to control the appearance of the box inside the violins, or you can use the size parameter of the swarmplot function to control the size of the points in the scatterplot.

Answered By: Ilak

Currently (seaborn 0.12.1), sns.violinplot seems to accept a list of lists as data, and interprets it similarly to a wide-form dataframe. sns.stripplot (as well as sns.swarmplot), however, interprets this as a single dataset.

On the other hand, sns.stripplot accepts a dictionary of lists and interprets it as a wide form dataframe. But sns.violinplot refuses to work with that dictionary.

Note that seaborn is being actively reworked internally to allow a wider set of data formats, so one of the future versions will tackle this issue.

So, a list of lists for the violin plot, and a dictionary for the stripplot allows combining both:

import seaborn as sns
import pandas as pd
import numpy as np

X1, X2, X3, X4 = [pd.DataFrame({'total': np.random.normal(.1, 1, np.random.randint(99, 300)).cumsum()})
                  for _ in range(4)]

ax = sns.violinplot([X1['total'], X2['total'], X3['total'], X4['total']], inner=None)
sns.stripplot({0: X1['total'], 1: X2['total'], 2: X3['total'], 3: X4['total']},
              edgecolor='black', linewidth=1, palette=['white'] * 4, ax=ax)

combining sns.violinplot and sns.stripplot
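If you would rather not depend on how each function interprets lists and dictionaries, one alternative (a sketch, not something tested against every seaborn version) is to reshape the data into a single long-form dataframe and pass the same data, x and y to both calls. The 'group' column and its X1..X4 labels are names I made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Stand-ins for the question's X1..X4: dataframes with a 'total' column
frames = [pd.DataFrame({"total": rng.normal(0.1, 1, n).cumsum()})
          for n in (120, 150, 200, 250)]

# Long form: one row per observation, plus a hypothetical 'group' label column
long_df = pd.concat(
    [f[["total"]].assign(group=f"X{i}") for i, f in enumerate(frames, start=1)],
    ignore_index=True,
)

fig, ax = plt.subplots()
sns.violinplot(data=long_df, x="group", y="total", inner=None, ax=ax)
sns.stripplot(data=long_df, x="group", y="total",
              color="white", edgecolor="black", linewidth=1, ax=ax)
```

Because both functions get the same x and y columns, the points should line up with their violins without mixing input formats.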

Answered By: JohanC