Python (matplotlib) equivalent of stacked bar chart in R (ggplot)
Question:
I am looking for an equivalent in python (matplotlib) of the following stacked bar chart created in R (ggplot):
The dummy data (in R) looks like this:
seasons <- c("Winter", "Winter", "Winter", "Spring", "Spring", "Spring", "Summer", "Summer", "Summer", "Fall", "Fall", "Fall")
feelings <- c("Cold", "Cold", "Cold", "Warm", "Warm", "Cold", "Warm", "Warm", "Warm", "Warm", "Cold", "Cold")
survey <- data.frame(seasons, feelings)
In R I can create the chart I am looking for with the following one-liner:
ggplot(survey, aes(x=seasons, fill=feelings)) + geom_bar(position = "fill")
It looks like this:
How can I create this chart in python (preferably with matplotlib) in an easy and compact way?
I found some (almost) fitting solutions but they were all rather complicated and far away from a one-liner. Or is this not possible in python (matplotlib)?
Answers:
Step 1. Prepare your data
import pandas as pd

df = pd.DataFrame(
{
"seasons":["Winter", "Winter", "Winter", "Spring", "Spring", "Spring", "Summer", "Summer", "Summer", "Fall", "Fall", "Fall"],
"feelings":["Cold", "Cold", "Cold", "Warm", "Warm", "Cold", "Warm", "Warm", "Warm", "Warm", "Cold", "Cold"]
}
)
df_new = df.pivot_table(columns="seasons", index="feelings", aggfunc=len, fill_value=0).T.apply(lambda x: x/sum(x), axis=1)
df_new
feelings Cold Warm
seasons
Fall 0.666667 0.333333
Spring 0.333333 0.666667
Summer 0.000000 1.000000
Winter 1.000000 0.000000
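The same table of per-season shares can probably be built more directly with pd.crosstab, which counts each (season, feeling) pair and, with normalize="index", divides every row by its total. This is a minimal sketch of an alternative to the pivot_table call above, not the answer's original method; the variable name shares is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "seasons": ["Winter", "Winter", "Winter", "Spring", "Spring", "Spring",
                    "Summer", "Summer", "Summer", "Fall", "Fall", "Fall"],
        "feelings": ["Cold", "Cold", "Cold", "Warm", "Warm", "Cold",
                     "Warm", "Warm", "Warm", "Warm", "Cold", "Cold"],
    }
)

# Count each (season, feeling) pair, then divide each row by its row total
# so that every season's shares add up to 1.
shares = pd.crosstab(df["seasons"], df["feelings"], normalize="index")
print(shares)
```

The result has one row per season and one column per feeling, so it can be passed to the plotting step in place of df_new.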
Step 2. Plot your data
import matplotlib.pyplot as plt

plt.style.use('ggplot')  # set the style before plotting so it applies to this figure
ax = df_new.plot.bar(stacked=True, rot=0)  # rot=0 keeps the season labels horizontal
ax.legend(loc='center left', bbox_to_anchor=(1.0, 0.5), title="feelings", framealpha=0)
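For anyone who wants to see what stacked=True amounts to, here is a rough sketch in plain matplotlib: each feeling is drawn with ax.bar on top of the running total of the previous ones via the bottom argument. The names survey, shares and bottom are illustrative, and the shares are computed here with groupby/value_counts rather than the pivot table above.

```python
import pandas as pd
import matplotlib.pyplot as plt

survey = pd.DataFrame(
    {
        "seasons": ["Winter", "Winter", "Winter", "Spring", "Spring", "Spring",
                    "Summer", "Summer", "Summer", "Fall", "Fall", "Fall"],
        "feelings": ["Cold", "Cold", "Cold", "Warm", "Warm", "Cold",
                     "Warm", "Warm", "Warm", "Warm", "Cold", "Cold"],
    }
)

# Share of each feeling within each season (every row sums to 1).
shares = (
    survey.groupby("seasons")["feelings"]
    .value_counts(normalize=True)
    .unstack(fill_value=0)
)

fig, ax = plt.subplots()
bottom = pd.Series(0.0, index=shares.index)  # running height of each stack
for feeling in shares.columns:
    ax.bar(shares.index, shares[feeling], bottom=bottom, label=feeling)
    bottom = bottom + shares[feeling]

ax.set_ylabel("proportion")
ax.legend(title="feelings", loc="center left", bbox_to_anchor=(1.0, 0.5))
```

This is longer than the pandas call, but it makes the stacking explicit and gives full control over each layer.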
If you’re not wed to matplotlib and really prefer ggplot, you can use the plotnine library, a Python port of ggplot2. The syntax is nearly identical:
import pandas as pd
from plotnine import ggplot, aes, geom_bar
survey = pd.DataFrame({
'seasons': ['Winter', 'Winter', 'Winter', 'Spring', 'Spring', 'Spring', 'Summer', 'Summer', 'Summer', 'Fall', 'Fall', 'Fall'],
'feelings': ['Cold', 'Cold', 'Cold', 'Warm', 'Warm', 'Cold', 'Warm', 'Warm', 'Warm', 'Warm', 'Cold', 'Cold'],
})
(
ggplot(survey, aes(x='seasons', fill='feelings'))
+ geom_bar(position = 'fill')
)
As is the output:
Here is a one-liner using seaborn (seaborn is built on matplotlib, so the result can be further customized with matplotlib calls).
import pandas as pd
import seaborn as sns
df = pd.DataFrame(
{
"seasons":["Winter", "Winter", "Winter", "Spring", "Spring", "Spring", "Summer", "Summer", "Summer", "Fall", "Fall", "Fall"],
"feelings":["Cold", "Cold", "Cold", "Warm", "Warm", "Cold", "Warm", "Warm", "Warm", "Warm", "Cold", "Cold"]
}
)
sns.histplot(df, x="seasons", hue="feelings", multiple="fill")
As a former R user who now mostly uses Python, I had also been looking for an easier way to do this without relying on plotnine, using plotting libraries more native to the Python community instead.
Hope this helps!
Packages used:
- pandas==1.5.2
- seaborn==0.12.2