Python – Seaborn kdeplot set point mark

Question:

I’m trying to do something like this (Draw a point at the mean peak of a distplot or kdeplot in Seaborn). I need to mark the point with x value 13.72.

But I get this error:
list index out of range

on this line: x = ax.lines[0].get_xdata()

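import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

int_rate = df['int_rate']
ax = sns.kdeplot(int_rate, shade=True)

x = ax.lines[0].get_xdata()  # raises IndexError: list index out of range
y = ax.lines[0].get_ydata()
maxid = np.where(x == 13.72)
plt.plot(x[maxid], y[maxid], 'bo', ms=10)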
Asked By: mraklbrw


Answers:

The problem is that with shade=True, seaborn draws a matplotlib.collections.PolyCollection instead of a matplotlib.lines.Line2D object, so ax.lines is an empty list (ax does not contain any lines).

You could set shade to False and follow your given example and it will work. If you want to keep the shading and still mark the highest point, you need to get that point's coordinates from the PolyCollection object.

As per this Stack Overflow question, you can do so via the get_paths() method of PolyCollection.
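To see the mechanism in isolation, here is a minimal sketch that uses plain Matplotlib rather than seaborn. It assumes the shaded area is drawn with fill_between, which is an assumption about seaborn's internals; x_grid and density are hypothetical stand-ins for a KDE curve, not your data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PolyCollection

# Hypothetical stand-in for a KDE curve (a bell shape peaking near x = 12)
x_grid = np.linspace(0, 30, 200)
density = np.exp(-0.5 * ((x_grid - 12.0) / 4.0) ** 2)

fig, ax = plt.subplots()
ax.fill_between(x_grid, density, alpha=0.25)  # filled area only, no line

# A filled area adds no Line2D, so ax.lines[0] would raise IndexError here
n_lines_before = len(ax.lines)

# The filled area is stored as a PolyCollection in ax.collections
poly = ax.collections[0]

# The polygon outline holds the curve points plus the baseline points
vx, vy = poly.get_paths()[0].vertices.T
peak = vy.argmax()
ax.plot(vx[peak], vy[peak], "bo", ms=10)
```

Indexing ax.collections instead of ax.get_children() is a design choice in this sketch: it only looks at collection artists, so it does not depend on the order in which other artists were added to the axes.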

If you replace your code:

x = ax.lines[0].get_xdata()
y = ax.lines[0].get_ydata()
maxid = np.where(x == 13.72)
plt.plot(x[maxid],y[maxid], 'bo', ms=10)

with:

x, y = ax.get_children()[0].get_paths()[0].vertices.T
maxid = y.argmax()
plt.plot(x[maxid], y[maxid], 'bo', ms=10)

You will get a shaded KDE plot with the highest point marked.

Note: the plot for this answer used the tips dataset from seaborn.

Edit: Since you need to mark a specific x value, note that neither PolyCollection.get_paths() nor ax.lines[0].get_xdata() necessarily returns the exact x values contained in the plotted dataset. You might want to try rounding these arrays before looking up the index, e.g. np.where(np.round(x, 2) == 13.72).
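As a hedged illustration of why an exact match tends to fail, the sketch below uses NumPy only. The arrays x and y are hypothetical stand-ins for the coordinates returned by get_xdata() and get_ydata(), not real seaborn output. It shows two alternatives to rounding: taking the nearest grid point, or interpolating the curve at the target with np.interp.

```python
import numpy as np

# Hypothetical stand-in for the curve coordinates: an evenly spaced grid
# that does not contain 13.72 exactly
x = np.linspace(0.0, 30.0, 200)
y = np.exp(-0.5 * ((x - 12.0) / 4.0) ** 2)

target = 13.72

# Exact comparison finds nothing because 13.72 is not a grid value
exact = np.where(x == target)[0]

# Option 1: index of the grid point closest to the target
nearest = int(np.abs(x - target).argmin())

# Option 2: linearly interpolate the curve height at the target itself
# (np.interp requires x to be increasing)
y_interp = float(np.interp(target, x, y))
```

The marker could then be drawn at (x[nearest], y[nearest]) or at (target, y_interp). Note that np.interp needs increasing x values, which holds for a line's x data but not for the polygon vertices of a shaded area, since the outline runs along the curve and then back along the baseline.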

Answered By: dm2

100 years later. I actually figured out a way to do this; it's kinda janky, but it works. Because the curve is an estimated distribution evaluated on a grid, its points don’t match the data values exactly. But for plotting purposes, you can find the nearest point:

import numpy as np
def find_nearest(array, value):
    array = np.asarray(array)
    idx = (np.abs(array - value)).argmin()
    return array[idx]

Then you can find the point on the line nearest to the true x. For plotting purposes, it should be close enough:

import seaborn as sns

int_rate = df['int_rate']
# Draw the unshaded curve so that ax.lines contains the KDE line
ax = sns.kdeplot(int_rate)

x = ax.lines[0].get_xdata()
y = ax.lines[0].get_ydata()

# Map each x on the curve to its density value
t_dic = dict(zip(x, y))

true_x = 13.72

# x value on the curve that is closest to the target
x_point = find_nearest(x, true_x)

sns.scatterplot(x=[x_point], y=[t_dic[x_point]])