Combine Binned barplot with lineplot

Question:

I’d like to represent two datasets on the same plot, one as a line and one as a binned barplot. I can do each individually:

tobar = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))
tobar["bins"] = pd.qcut(tobar.index, 20)

bp = sns.barplot(data=tobar, x="bins", y="value")

[Image: barplot by itself]

toline = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))

lp = sns.lineplot(data=toline, x=toline.index, y="value")

[Image: lineplot by itself]

But when I try to combine them, of course the x axis gets messed up:

fig, ax = plt.subplots()
ax2 = ax.twinx()
bp = sns.barplot(data=tobar, x="bins", y="value", ax=ax)
lp = sns.lineplot(data=toline, x=toline.index, y="value", ax=ax2)
bp.set(xlabel=None)

[Image: failed attempt at combining them]

I also can’t seem to get rid of the bin labels.

How can I get both pieces of information on one plot?

Asked By: Whitehot

Answers:

  • This answer explains why it’s better to plot the bars with matplotlib.axes.Axes.bar instead of sns.barplot or pandas.DataFrame.bar.
    • In short, with matplotlib the xtick locations correspond to the actual numeric values of the labels, whereas seaborn and pandas place the bars at 0-indexed positions (0, 1, 2, ...) that don’t correspond to the numeric values.
  • This answer shows how to add bar labels.
  • ax2 = ax.twinx() can be used for the line plot if it needs its own y-axis.
  • The approach works the same if the line plot uses different data.
  • Tested in Python 3.11, pandas 1.5.2, matplotlib 3.6.2, seaborn 0.12.1.
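As a rough illustration of the tick-location point above, the following sketch draws the same three bars with matplotlib and with seaborn and compares where the bars end up on the x-axis. The tiny `demo` frame and the variable names are made up for this example, and the `Agg` backend is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# hypothetical toy data: three bars whose x values are real numbers
demo = pd.DataFrame({"mid": [25, 75, 125], "value": [1.0, -2.0, 3.0]})

fig, (ax_mpl, ax_sns) = plt.subplots(1, 2)

# matplotlib places each bar at its numeric x value
ax_mpl.bar(demo["mid"], demo["value"], width=10)

# seaborn treats x as categorical and places bars at positions 0, 1, 2
sns.barplot(data=demo, x="mid", y="value", ax=ax_sns)

# centre of each bar in data coordinates
mpl_centers = sorted(p.get_x() + p.get_width() / 2 for p in ax_mpl.patches)
sns_centers = sorted(p.get_x() + p.get_width() / 2 for p in ax_sns.patches)

print(mpl_centers)  # bars centred on 25, 75, 125
print(sns_centers)  # bars centred on 0, 1, 2
```

Because a line plotted against the index uses the real numeric positions, it only lines up with bars that are also drawn at numeric positions.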

Imports and DataFrame

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np

# test data
np.random.seed(2022)
df = pd.melt(pd.DataFrame(np.random.randn(1000).cumsum()))

# create the bins
df["bins"] = pd.qcut(df.index, 20)

# add a column for the midpoint of each interval (rounded to an integer)
df['mid'] = df.bins.apply(lambda row: row.mid.round().astype(int))

# pivot the dataframe to calculate the mean of each interval
pt = df.pivot_table(index='mid', values='value', aggfunc='mean').reset_index()

Plot 1

# create the figure
fig, ax = plt.subplots(figsize=(30, 7))

# add a horizontal line at y=0 
ax.axhline(0, color='black')

# add the bar plot
ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)

# set the labels on the xticks - if desired
ax.set_xticks(ticks=pt.mid, labels=pt.mid)

# add the intervals as labels on the bars - if desired
ax.bar_label(ax.containers[0], labels=df.bins.unique(), weight='bold')

# add the line plot
_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')

Plot 2

fig, ax = plt.subplots(figsize=(30, 7))

ax.axhline(0, color='black')

ax.bar(data=pt, x='mid', height='value', width=4, alpha=0.5)

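# label the xticks with the intervals instead of the midpoints
ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)

# label each bar with its height (the mean of the interval)
ax.bar_label(ax.containers[0], weight='bold')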

_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')

Plot 3

  • The bar width is set to the width of the interval (about 50 index positions per bin, since 1000 rows are split into 20 bins).

fig, ax = plt.subplots(figsize=(30, 7))

ax.axhline(0, color='black')

ax.bar(data=pt, x='mid', height='value', width=50, alpha=0.5, ec='k')

ax.set_xticks(ticks=pt.mid, labels=df.bins.unique(), rotation=45)

ax.bar_label(ax.containers[0], weight='bold')

_ = sns.lineplot(data=df, x=df.index, y="value", ax=ax, color='tab:orange')
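For the case in the question, where the line comes from a different dataset, a minimal sketch along the same lines might look like the following. It is not part of the tested answer: the names `tobar`, `toline`, `mids` and `widths` are chosen for illustration, the bar widths are read from the intervals rather than hard-coded, and the line goes on a twin axis so it gets its own y-scale.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display needed)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# two independent random walks, as in the question
rng = np.random.default_rng(2022)
tobar = pd.DataFrame({"value": rng.standard_normal(1000).cumsum()})
toline = pd.DataFrame({"value": rng.standard_normal(1000).cumsum()})

# 20 quantile bins over the index; qcut returns a Categorical of intervals
bins = pd.qcut(tobar.index, 20)

# mean of each bin, grouped by the integer bin code (0..19, in interval order)
means = tobar["value"].groupby(bins.codes).mean()

# midpoint and width of each interval, taken from the interval categories
mids = bins.categories.mid.to_numpy()
widths = bins.categories.length.to_numpy()

fig, ax = plt.subplots(figsize=(12, 4))
ax.axhline(0, color="black")

# bars sit at the numeric midpoints, so they share the line's x scale
ax.bar(mids, means.to_numpy(), width=widths, alpha=0.5, ec="k")

# second y-axis for the line; the x-axis is shared with the bars
ax2 = ax.twinx()
ax2.plot(toline.index, toline["value"], color="tab:orange")

fig.savefig("combined.png")
```

If both datasets are on a comparable scale, dropping `ax.twinx()` and plotting the line on `ax` directly keeps a single y-axis, as in the plots above.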

Answered By: Trenton McKinney