How to get a stacked histogram in PairGrid or pairplot
Question:
I wish to reproduce the PairGrid plot found in that tutorial, but locally my bar charts are not stacked as they are in the tutorial, and I can't figure out how to make them so.
import seaborn as sns
import matplotlib.pyplot as plt # for graphics
import os
os.sys.version
# '3.6.4 (default, Sep 20 2018, 19:07:50) \n[GCC 5.4.0 20160609]'
sns.__version__
# '0.9.0'
mpg = sns.load_dataset('mpg')
g = sns.PairGrid(data=mpg[["mpg", "horsepower", "weight", "origin"]], hue="origin")
g.map_upper(sns.regplot)
g.map_lower(sns.residplot)
# below for the histogram
g.map_diag(plt.hist)
# also I tried
# g.map_diag(lambda x, label, color: plt.hist(x, label=label, color=color, histtype='barstacked', alpha=.4))
# g.map_diag(plt.hist, histtype='barstacked')
# but same result
g.savefig('./Plots/mpg.svg')
Do I have to follow the second answer to this post, which suggests that this is very tricky to do with seaborn, or should I go back to plt, as suggested here, for a simpler chart?
In any case I’m curious to understand how they stacked their bars in the tutorial linked above.
Answers:
The option for stacked histograms on the diagonal of a PairGrid has been removed from seaborn in this commit and hence is not available anymore in seaborn 0.9.
A workaround could be to collect all the data first and then plot it to the respective axes.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = sns.load_dataset('mpg')
g = sns.PairGrid(data=df[["mpg", "horsepower", "weight", "origin"]], hue="origin")
g.map_upper(sns.regplot)
g.map_lower(sns.residplot)
# below for the histograms on the diagonal
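# collect the per-hue data and colors for each diagonal axes
d = {}
def func(x, **kwargs):
    # map_diag makes the diagonal axes current before calling func
    ax = plt.gca()
    if ax not in d:
        d[ax] = {"data": [], "color": []}
    d[ax]["data"].append(x)
    d[ax]["color"].append(kwargs.get("color"))
g.map_diag(func)
# draw each diagonal histogram in a single call so the groups are stacked
for ax, dic in d.items():
    ax.hist(dic["data"], color=dic["color"], histtype="barstacked")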
plt.show()
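A likely reason why g.map_diag(plt.hist, histtype='barstacked') changes nothing is that map_diag calls the function once per hue level, so each plt.hist call only ever sees one dataset and has nothing to stack on. The following is a minimal sketch of that difference in plain matplotlib. It uses synthetic data rather than the mpg dataset, and the names groups, bins, ax_overlap and ax_stacked are made up for the illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-ins for the per-hue subsets (assumption: not the mpg data).
rng = np.random.default_rng(0)
groups = [rng.normal(loc, 1.0, size=200) for loc in (0.0, 1.0, 2.0)]
bins = np.linspace(-4.0, 6.0, 21)  # 20 shared bins

fig, (ax_overlap, ax_stacked) = plt.subplots(1, 2, sharey=True, figsize=(8, 3))

# One hist call per group: every call starts from zero, so the bars overlap
# even though histtype="barstacked" is passed.
for values in groups:
    ax_overlap.hist(values, bins=bins, histtype="barstacked", alpha=0.4)
ax_overlap.set_title("one call per group")

# A single call with all groups: matplotlib stacks them on top of each other.
counts, edges, _ = ax_stacked.hist(groups, bins=bins, histtype="barstacked")
ax_stacked.set_title("one call, all groups")

# When stacking, the returned counts are cumulative across groups,
# so the last row holds the total count per bin.
total_per_bin = counts[-1]
```

Call plt.show() or fig.savefig(...) afterwards to look at the two panels side by side.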
- Given the current version of seaborn, the accepted answer works, but is obsolete. sns.histplot has the parameter multiple, which has {'layer', 'dodge', 'stack', 'fill'} as options.
  - This can be passed to sns.pairplot with diag_kws={'multiple': 'stack'}.
  - Use g.map_diag(sns.histplot, multiple='stack') with sns.PairGrid.
- seaborn doesn't offer an option for a stacked bar plot, like pandas. However, this answer shows how to create a stacked bar plot with sns.histplot.
- Tested with matplotlib 3.5.2 & seaborn 0.12.0
sns.pairplot
import seaborn as sns
mpg = sns.load_dataset('mpg')
data = mpg[["mpg", "horsepower", "weight", "origin"]]
g = sns.pairplot(data=data, hue='origin', diag_kind='hist', diag_kws={'multiple': 'stack'})
sns.PairGrid
import seaborn as sns
mpg = sns.load_dataset('mpg')
data = mpg[["mpg", "horsepower", "weight", "origin"]]
g = sns.PairGrid(data=data, hue='origin')
g.map_upper(sns.regplot, scatter_kws=dict(linewidth=1, ec='white', s=20))
g.map_lower(sns.residplot, scatter_kws=dict(linewidth=1, ec='white', s=20))
_ = g.map_diag(sns.histplot, multiple='stack')