How to make dots in Swarmplot (Seaborn) overlap with each other?

Question:

I have made a swarmplot with seaborn, but I can’t find an option to make the dots overlap with each other.

They do overlap, but only at the sides of each swarm.

I would like them to overlap everywhere whenever they do not fit, rather than only at the sides.

import seaborn as sns

data = sns.load_dataset('iris')
sns.swarmplot(data=data, y="sepal_length", x="species", edgecolor="black", alpha=.5, s=15, linewidth=1.0)


Asked By: Hielke Walinga


Answers:

I don’t think swarmplot offers a way to make the markers overlap deliberately. Of course, smaller markers would not overlap at all, if that is what you want.

Otherwise, a hacky workaround is to exploit the fact that seaborn fixes the distance between markers for a specific figure size. If you plot on a huge figure, where no overlap happens, and then make the figure smaller afterwards, the overlap should be pretty high.

import seaborn as sns
import matplotlib.pyplot as plt

data = sns.load_dataset('iris')
fig, ax = plt.subplots(figsize=(19,4.8))
sns.swarmplot(data=data, y="sepal_length", x="species", 
                   edgecolor="black",alpha=.5, s=15,linewidth=1.0, ax=ax)
fig.set_size_inches(6.4,4.8)

plt.show()


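Here you would need to experiment with the initial figsize until you are happy with the result. Note that this relies on seaborn fixing the marker positions when the plot is created; a version that recomputes the swarm at draw time would undo the effect.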

You could also use a stripplot instead of a swarmplot. As far as I know, the whole point of swarmplot is to produce output similar to stripplot, but with points that don’t overlap.

data = sns.load_dataset('iris')
sns.stripplot(data=data, y="sepal_length", x="species", edgecolor="black",alpha=.5, s=15,linewidth=1.0)


In addition, you can control the amount of overlap using the jitter= keyword.
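
To see what jitter= does, here is a minimal sketch that draws the same stripplot with three jitter settings side by side. It uses a made-up DataFrame (`demo`) in place of the iris data so that it runs offline, and `size=` rather than `s=` for the marker size; both choices are assumptions of this sketch, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the two iris columns used above (values are made up)
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "species": np.repeat(["a", "b", "c"], 50),
    "sepal_length": rng.normal(loc=5.8, scale=0.8, size=150),
})

fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharey=True)
for ax, jitter in zip(axes, [False, 0.1, 0.3]):
    # jitter=False stacks every point of a category on one vertical line;
    # a float sets the half-width of the random horizontal spread
    sns.stripplot(data=demo, x="species", y="sepal_length", jitter=jitter,
                  edgecolor="black", alpha=.5, size=15, linewidth=1.0, ax=ax)
    ax.set_title(f"jitter={jitter}")
```

With jitter=False the points overlap the most; larger values spread them over a wider band around each category.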

Answered By: Diziet Asahi

Another workaround that gives both clustering according to the distribution (not possible with stripplot) and overlap between points (and thus faster plotting) is to define a custom density_jitter function:

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


def density_jitter(values, width=1.0, cluster_factor=1.0):
    """
    Add jitter to a 1D array of values, with more spread where the values
    are dense (a binned approximation of a density estimate)
    """
    # Offsets are returned in the same order as `values`
    values = np.asarray(values)
    N = len(values)
    nbins = 100
    # Quantize the values into bins along the value axis
    quant = np.round(nbins * (values - np.min(values)) / (np.max(values) - np.min(values) + 1e-8))
    # Sort by bin; the tiny random term randomizes the order within a bin
    inds = np.argsort(quant + np.random.randn(N) * 1e-6)
    layer = 0
    last_bin = -1
    ys = np.zeros(N)
    for ind in inds:
        if quant[ind] != last_bin:
            layer = 0
        # Alternate sides within a bin: 0, +1, -1, +2, -2, ...
        ys[ind] = cluster_factor * (np.ceil(layer / 2) * ((layer % 2) * 2 - 1))
        layer += 1
        last_bin = quant[ind]
    # Rescale so the offsets stay within +/- width
    ys *= 0.9 * (width / np.max(ys + 1))

    return ys


data = sns.load_dataset('iris')

for ind, species in enumerate(data.species.unique()):
    lengths = data[data.species == species].sepal_length.values
    ys = density_jitter(lengths, width=0.4, cluster_factor=0.2)
    plt.scatter(ind + ys, lengths, alpha=0.3, color=plt.cm.tab10(ind))
plt.xticks(np.arange(3), data.species.unique())
plt.show()
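
The side-alternating formula inside the loop is easier to follow in isolation. The sketch below reproduces just that expression; `layer_offset` is a hypothetical helper name used for illustration and is not part of the answer's code.

```python
import math


def layer_offset(layer):
    """Signed horizontal slot for the n-th point that falls into the same bin."""
    side = (layer % 2) * 2 - 1       # -1 for even layers, +1 for odd layers
    distance = math.ceil(layer / 2)  # 0, 1, 1, 2, 2, 3, 3, ...
    return distance * side


offsets = [layer_offset(n) for n in range(7)]
print(offsets)  # [0, 1, -1, 2, -2, 3, -3]
```

The first point in a bin stays on the category axis, and each later point is placed one slot further out on alternating sides, which is why dense bins end up wider than sparse ones.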


Answered By: MonsieurWave