Add vertical lines to a layered horizontal bar chart

Question:

I am generating a layered horizontal bar chart, and I now need to annotate each horizontal bar with a vertical line at a specific x-axis value. The y-axis is categorical (names) and the x-axis is numeric (integers).

I’ve looked at Axes.vlines but could not get it working.

import seaborn as sns
import matplotlib.pyplot as plt
crashes = sns.load_dataset("car_crashes").sort_values("total", ascending=False)
crashes['max_range'] = crashes['total'] * 0.85
sns.set_color_codes("muted")
sns.set(style="whitegrid")
sns.barplot(x="total", y="abbrev", data=crashes, label="", color="r")
sns.barplot(x="max_range", y="abbrev", data=crashes, label="", color="y")
sns.barplot(x="alcohol", y="abbrev", data=crashes, label="normal range", color="g")

# dummy data for the "vertical lines" I want to plot
crashes['actual'] = crashes['alcohol'] * 1.85


The code above creates a plot like this one:

https://seaborn.pydata.org/examples/horizontal_barplot.html

I want to add a vertical line to every row of the plot (one per bar), positioned using another column of the underlying dataframe.

Asked By: Phil S.


Answers:

Axes.vlines is sufficient for the job. First, I extract the y positions of the bar chart's tick labels. Then I build a dictionary mapping each label to its x value. Finally, I use Axes.vlines to draw a red line on each bar.

import seaborn as sns
import matplotlib.pyplot as plt

crashes = sns.load_dataset("car_crashes").sort_values("total", ascending=False)
crashes['max_range'] = crashes['total'] * 0.85
sns.set_color_codes("muted")
sns.set(style="whitegrid")
# Store the returned axes in a variable
fig, ax = plt.subplots(figsize=(12, 12))  # added for readability
sns.barplot(x="total", y="abbrev", data=crashes, label="", color="r", ax=ax)
sns.barplot(x="max_range", y="abbrev", data=crashes, label="", color="y", ax=ax)
sns.barplot(x="alcohol", y="abbrev", data=crashes, label="normal range", color="g", ax=ax)

# dummy data for the "vertical lines" I want to plot
crashes['actual'] = crashes['alcohol'] * 1.85

#### MY ADDITIONS ####

# Form dictionary of bar chart keys (i.e. Y axis data, here it is "abbrev") to
# corresponding y and x points
# map each tick label (the "abbrev" text) to its y position
y_tick_vals = {lab.get_text(): pos for lab, pos in zip(ax.get_yticklabels(), ax.get_yticks())}
# map each tick label to the x value where its line should be drawn
actual = crashes.set_index("abbrev")["actual"]
x_points = {lab: actual[lab] for lab in y_tick_vals}

# for each of the relevant y axis, draw a vertical line
for key in y_tick_vals:
    c_y = y_tick_vals[key]
    c_x = x_points[key]
    # Found by trial and error: a half-height of 0.25 keeps the line inside
    # the bar (seaborn draws these bars 0.8 wide by default); other plots may
    # need a different value.
    c_ymin = c_y - 0.25
    c_ymax = c_y + 0.25

    ax.vlines(c_x, c_ymin, c_ymax, colors="r")

plt.show()
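
Rather than finding the half-height by trial and error, the bar edges can be read from the bar artists themselves. The following is a minimal sketch under stated assumptions: it uses plain matplotlib ax.barh with made-up data (names, totals, and marks are hypothetical), not the car_crashes dataset. The same idea should carry over to bars drawn by seaborn, since those are also Rectangle patches on the Axes, but that is not tested here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# hypothetical sample data, for illustration only
names = ["AA", "BB", "CC"]
totals = [10.0, 8.0, 6.0]
marks = [7.0, 3.0, 5.0]  # x positions of the vertical lines, one per bar

fig, ax = plt.subplots()
pos = list(range(len(names)))
bars = ax.barh(pos, totals, color="g")
ax.set_yticks(pos)
ax.set_yticklabels(names)

# each bar is a Rectangle: get_y() is its lower edge, get_height() its thickness
ymin = [bar.get_y() for bar in bars]
ymax = [bar.get_y() + bar.get_height() for bar in bars]

# one call draws all lines, because vlines accepts array-like arguments
lines = ax.vlines(marks, ymin, ymax, colors="k")
ax.invert_yaxis()
```

When several layers are drawn on the same Axes, each plotting call adds its own set of bars, so the rectangles to measure would need to be picked from the layer of interest.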


Answered By: SpaceMonkey55

Imports and Sample Data

  • Tested in Python 3.11.2, pandas 2.0.0, matplotlib 3.7.1, seaborn 0.12.2
import seaborn as sns
import matplotlib.pyplot as plt

# sample data
crashes = sns.load_dataset("car_crashes").sort_values("total", ascending=False, ignore_index=True)
crashes['max_range'] = crashes['total'] * 0.85
crashes['actual'] = crashes['alcohol'] * 1.85

# long data
cols = ['total', 'max_range', 'alcohol']
df = crashes.melt(id_vars='abbrev', value_vars=cols)

Layered Bars vs. Stacked Bars

  • The bars in the OP are layered in the z-direction, not stacked end-to-end. Layering bars this way is confusing, and it is not how a stacked bar plot is normally read.
    • This is important to note when trying to plot stacked bars.
    • This issue is perpetuated in this answer.
  • sns.barplot does not support stacked bars.
  • See matplotlib: Stacked bar chart, where bottom= is correctly used, to plot stacked bars manually with plt.bar/ax.bar. Otherwise use pandas.DataFrame.plot with stacked=True.
# create the figure with subplots
fig, ((ax1, ax2, ax3), (ax4, ax5, ax6)) = plt.subplots(2, 3, figsize=(12, 15), sharex=True, tight_layout=True)

# individual bars
sns.barplot(x="total", y="abbrev", data=crashes, color="r", label='total', ax=ax1)
ax1.legend()
sns.barplot(x="max_range", y="abbrev", data=crashes, color="y", label='max_range', ax=ax2)
ax2.legend()
sns.barplot(x="alcohol", y="abbrev", data=crashes, color="g", label='alcohol', ax=ax3)
ax3.legend()

# layered bars - not stacked
sns.barplot(x="total", y="abbrev", data=crashes, label="total", color="r", ax=ax4)
sns.barplot(x="max_range", y="abbrev", data=crashes, label="max_range", color="y", ax=ax4)
sns.barplot(x="alcohol", y="abbrev", data=crashes, label="alcohol", color="g", ax=ax4)
ax4.set_title('These bars are layered in the z-direction')
ax4.legend()

# stacked bars with sns.histplot, not sns.barplot
sns.histplot(data=df, y='abbrev', weights='value', hue='variable', multiple='stack', hue_order=cols,
             palette=['r', 'y', 'g'], ax=ax5)
ax5.set_title('These bars are stacked')

# stacked bars with pandas.DataFrame.plot
crashes.plot(kind='barh', x='abbrev', y=['alcohol', 'max_range', 'total'], stacked=True, width=0.8,
             color=['g', 'y', 'r'], title='These bars are stacked', ax=ax6)
ax6.invert_yaxis()
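
For the manual route mentioned in the bullets above, the horizontal counterpart of bottom= in ax.bar is left= in ax.barh. The sketch below is only an illustration with made-up numbers (names and layers are hypothetical, not the car_crashes data): it keeps a running total per row and starts each new segment where the previous one ended.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# hypothetical sample data, for illustration only
names = ["AA", "BB", "CC"]
layers = {
    "alcohol": np.array([3.0, 2.0, 1.0]),
    "max_range": np.array([5.0, 4.0, 3.0]),
    "total": np.array([6.0, 5.0, 4.0]),
}
colors = ["g", "y", "r"]

fig, ax = plt.subplots()
pos = np.arange(len(names))
left = np.zeros(len(names))  # running right edge of what has been drawn so far

for (label, values), color in zip(layers.items(), colors):
    # each segment starts at the cumulative sum of the previous segments
    ax.barh(pos, values, left=left, color=color, label=label)
    left = left + values

ax.set_yticks(pos)
ax.set_yticklabels(names)
ax.invert_yaxis()
ax.legend()
```

Because each segment is offset by the ones before it, an x value measured within a given segment has to be shifted by the same cumulative offset before it is passed to .vlines.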


Vertical Lines on Stacked Bars

  • Use .vlines to plot the vertical lines on the stacked bars.
    • array-like objects are accepted for x, ymin, and ymax.
    • See How to draw vertical lines on a given plot for a thorough explanation of .vlines.
    • Because the bars are correctly stacked, it may be necessary to add the values for the bottom stack ('alcohol') to 'actual' to determine the correct values for x=.
      • x=crashes.actual.add(crashes.alcohol)
  • Plot the stacked bars directly with pandas.DataFrame.plot.
    • Setting width to a known value will help to calculate the edges of the bars for the vertical lines.
    • The bars will directly match the dataframe rows, which can be passed to x= when using .vlines.
# manually set the width 
width = 0.75

# plot the DataFrame
ax = crashes.plot(kind='barh', x='abbrev', y=['alcohol', 'max_range', 'total'], figsize=(12, 12), stacked=True, color=['g', 'y', 'r'], width=width)

# flip the order of the yaxis
ax.invert_yaxis()

# divide the bar width by two and calculate the edges of the bar
ymax, ymin = zip(*[(y + width/2, y - width/2) for y in ax.get_yticks()])

# plot the vertical lines
_ = ax.vlines(x=crashes.actual.add(crashes.alcohol), ymin=ymin, ymax=ymax, color='k')


Vertical Lines on Layered Bars

  • Still use pandas.DataFrame.plot, but plot each column separately on the same Axes so the bars overlap instead of stacking.
fig, ax = plt.subplots(figsize=(12, 12))

width = 0.75
for col, color in zip(['total', 'max_range', 'alcohol'], ['r', 'y', 'g']):
    crashes.plot(kind='barh', x='abbrev', y=col, color=color, width=width, ax=ax, title='Layered Bar Plot - Not Stacked')
ax.invert_yaxis()

ymax, ymin = zip(*[(y + width/2, y - width/2) for y in ax.get_yticks()])
_ = ax.vlines(x=crashes.actual, ymin=ymin, ymax=ymax, color='k')


Answered By: Trenton McKinney