How can I create a plot to visualize the 68–95–99.7 rule?

Question:

I've created a plot of a normal distribution like this:

import math
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_title('Probability density function')
ax.set_xlabel('x')
ax.set_ylabel('f(x)')
x = np.linspace(148, 200, 100)  # x from 148 to 200
y = (1 / (5 * math.sqrt(2*math.pi))) * np.exp((-(x-178)**2) / (2*5**2))
ax.plot(x, y)
plt.show()

But I also need to add vertical lines that stay inside the area under the curve, shade the inner segments, and add tick marks along the x-axis (at y = 0), as in this picture:

[Figure: what I need]

How can I do this in Python using matplotlib?

I've tried plt.axvline, but the vertical lines span the full height of the axes and extend above the curve:

plt.axvline(x = 178, color = 'g', label = 'axvline - full height')
plt.axvline(x = 178+5, color = 'b', label = 'axvline - full height')
plt.axvline(x = 178-5, color = 'b', label = 'axvline - full height')
plt.axvline(x = 178+5*2, color = 'r', label = 'axvline - full height')
plt.axvline(x = 178-5*2, color = 'r', label = 'axvline - full height')


Asked By: Nadezhda GR


Answers:

The line version can be implemented using vlines, but note that your reference figure can be better reproduced using fill_between.


Line version

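Instead of axvline, use vlines, which takes its ymin and ymax bounds in data coordinates. (axvline also accepts ymin and ymax, but as fractions of the axes height, so it cannot easily follow the curve.)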

Change your y into a lambda f(x, mu, sd) and use that to define the ymax bounds:

import numpy as np
import matplotlib.pyplot as plt

# define y as a lambda f(x, mu, sd)
f = lambda x, mu, sd: (1 / (sd * (2*np.pi)**0.5)) * np.exp((-(x-mu)**2) / (2*sd**2))

fig, ax = plt.subplots(figsize=(8, 3))
x = np.linspace(148, 200, 200)
mu = 178
sd = 5
ax.plot(x, f(x, mu, sd))

# define the line locations (mu and mu ± 1, 2, 3 sd) and their colors
xs = mu + sd*np.arange(-3, 4)
colors = [*'yrbgbry']

# draw lines at those locations from 0 up to the curve
ax.vlines(xs, ymin=0, ymax=[f(x, mu, sd) for x in xs], color=colors)

# relabel x ticks
plt.xticks(xs, [rf'${n}\sigma$' if n else '0' for n in range(-3, 4)])

[Figure: line version]
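As a quick check on the ymax values passed to vlines, here is a minimal sketch, assuming the same mu = 178 and sd = 5 as above. The name normal_pdf is a hypothetical stand-in for the lambda f; because it is built from NumPy operations, it can evaluate the whole array of line positions in one call.

```python
import numpy as np

def normal_pdf(x, mu, sd):
    """Normal density; works elementwise on scalars or NumPy arrays."""
    return np.exp(-(x - mu) ** 2 / (2 * sd ** 2)) / (sd * np.sqrt(2 * np.pi))

mu, sd = 178, 5
xs = mu + sd * np.arange(-3, 4)  # mu - 3*sd, ..., mu, ..., mu + 3*sd

# heights of the curve at each line position, computed in one call
heights = normal_pdf(xs, mu, sd)
print(heights)
```

The heights are symmetric around the mean and peak at 1 / (sd * sqrt(2*pi)), so an array like this could be passed directly as ymax.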


Shaded version

Use fill_between to better recreate the sample figure. Define the shaded bounds using the where parameter:

import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 3))
x = np.linspace(148, 200, 200)
mu = 178
sd = 5
y = (1 / (sd * (2*np.pi)**0.5)) * np.exp((-(x-mu)**2) / (2*sd**2))
ax.plot(x, y)

# use `where` condition to shade bounded regions
bounds = mu + sd*np.array([-np.inf] + list(range(-3, 4)) + [np.inf])
alphas = [0.1, 0.2, 0.5, 0.8, 0.8, 0.5, 0.2, 0.1]
for left, right, alpha in zip(bounds, bounds[1:], alphas):
    ax.fill_between(x, y, where=(x >= left) & (x < right), color='b', alpha=alpha)

# relabel x ticks
plt.xticks(bounds[1:-1], [rf'${n}\sigma$' if n else '0' for n in range(-3, 4)])

[Figure: shaded version]
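To see what the where conditions select, here is a small sketch that builds the same boolean masks without plotting anything. It assumes the same x grid, mu and sd as above; the names masks and counts are only for illustration.

```python
import numpy as np

mu, sd = 178, 5
x = np.linspace(148, 200, 200)

# region edges: -inf, mu - 3*sd, ..., mu + 3*sd, +inf
bounds = mu + sd * np.array([-np.inf, *range(-3, 4), np.inf])

# one boolean mask per region, using the same half-open test [left, right)
masks = [(x >= left) & (x < right) for left, right in zip(bounds, bounds[1:])]

# number of sample points that fall into each region
counts = [int(mask.sum()) for mask in masks]
print(counts)
```

Every sample point lands in exactly one of the eight masks. Since fill_between only fills between consecutive selected points, a thin unshaded sliver may remain between neighbouring regions; a denser x grid should make it less visible.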

To label the region percentages, add text objects at the midpoints of the bounded regions:

midpoints = mu + sd*np.arange(-3.5, 4)
percents = [0.1, 2.1, 13.6, 34.1, 34.1, 13.6, 2.1, 0.1]
colors = [*'kkwwwwkk']
for m, p, c in zip(
    midpoints, # midpoints of bounded regions
    percents,  # percents captured by bounded regions
    colors,    # colors of text labels
):
    ax.text(m, 0.01, f'{p}%', color=c, ha='center', va='bottom')

[Figure: shaded version labeled]
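The hard-coded percents do not have to be taken on faith. As a rough sketch, they can be recomputed from the standard normal CDF using only the standard library; normal_cdf, edges and coverage are hypothetical names introduced here, not part of the code above.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# region edges in units of sigma: -inf, -3, -2, ..., 3, +inf
edges = [-math.inf, *range(-3, 4), math.inf]

# probability mass of each region, as a percentage rounded to one decimal
percents = [
    round(100 * (normal_cdf(right) - normal_cdf(left)), 1)
    for left, right in zip(edges, edges[1:])
]

# coverage within 1, 2 and 3 standard deviations of the mean (in percent)
coverage = [100 * (normal_cdf(k) - normal_cdf(-k)) for k in (1, 2, 3)]

print(percents)
print(coverage)
```

The coverage values are the roughly 68%, 95% and 99.7% that give the rule its name, and the per-region values reproduce the list used for the text labels.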

Answered By: tdy